arxiv: 2606.01482 · v1 · pith:U4LGYFAHnew · submitted 2026-05-31 · 💻 cs.CL

Beyond Topical Similarity: Contrastive Evidence Retrieval with Interpretable Attention Alignment in RAG

Pith reviewed 2026-06-28 16:54 UTC · model grok-4.3

classification 💻 cs.CL
keywords RAG, contrastive learning, attention alignment, hard negative selection, interpretability, clinical trials, evidence retrieval

The pith

A contrastive retriever trained with human rationales aligns attention to factual evidence in RAG.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CERA, which trains dense retrievers with contrastive learning on subjectivity-based hard negatives plus an auxiliary loss that aligns the model's CLS-to-token attention with human-annotated factual rationales through part-of-speech-weighted masking. The method is evaluated on a large corpus of clinical trial reports, where it is reported to improve retrieval effectiveness over baselines and to make attention a more faithful explanation. The central goal is retrieval that identifies specific evidence tokens rather than mere topical similarity, which would make RAG systems more reliable and interpretable. This matters because current RAG pipelines often select text that is topically similar but non-evidential, which can lead to hallucinations.
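
The abstract does not say how subjectivity is scored or how negatives are ranked, so the following is only a rough sketch of what "subjectivity-based hard negative selection" could look like. It assumes a hypothetical cue-word lexicon (`SUBJECTIVE_CUES`), token-level Jaccard overlap as a stand-in for topical similarity, and an arbitrary `min_sim` threshold; none of these come from the paper.

```python
# Hypothetical cue words; a real system would more likely use a trained
# subjectivity classifier. This lexicon is an assumption for illustration.
SUBJECTIVE_CUES = {"believe", "promising", "encouraging", "suggest", "may", "hopeful"}

def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,;:()?!").lower() for t in text.split()]

def jaccard(a, b):
    """Token-set overlap, used here as a crude proxy for topical similarity."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def subjectivity(text):
    """Fraction of tokens that are subjective cue words."""
    toks = tokens(text)
    return sum(t in SUBJECTIVE_CUES for t in toks) / len(toks) if toks else 0.0

def select_hard_negatives(query, candidates, positive, n=1, min_sim=0.1):
    """Keep on-topic non-positive passages, then prefer the most subjective ones."""
    q = tokens(query)
    pool = [c for c in candidates
            if c != positive and jaccard(q, tokens(c)) >= min_sim]
    return sorted(pool, key=subjectivity, reverse=True)[:n]

query = "Does aspirin reduce stroke risk?"
positive = "Aspirin reduced stroke risk by 20 percent in the treatment arm."
candidates = [
    positive,
    "We believe aspirin may be promising for stroke prevention.",
    "The trial enrolled patients across twelve hospital sites.",
]
negatives = select_hard_negatives(query, candidates, positive)
```

In this toy example the opinionated sentence about aspirin is selected because it shares topic words with the query but states no evidence, while the off-topic sentence is filtered out before ranking.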

Core claim

CERA fine-tunes a dense retriever using two training objectives: triplet-based contrastive learning and interpretable attention alignment, which supervises CLS-to-token attention using a part-of-speech-weighted masking distribution over human-annotated factual rationales as evidence signals. Experiments demonstrate that the subjectivity-based hard negative selection substantially improves retrieval effectiveness compared to both Contriever and hard negative selection baselines, and rationale alignment improves faithfulness while maintaining competitive retrieval performance.
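
To make the two-objective structure concrete, here is a minimal sketch of a triplet loss combined with an auxiliary alignment term. The cosine distance, the `margin` value, and the weight `lam` are assumptions chosen for illustration; the abstract does not state the distance function, the margin, or how the two losses are weighted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_loss(query, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (margin is an assumed value)."""
    d_pos = 1.0 - cosine(query, positive)
    d_neg = 1.0 - cosine(query, negative)
    return max(0.0, d_pos - d_neg + margin)

def combined_loss(l_triplet, l_align, lam=0.5):
    """Weighted sum of both objectives; lam is a hypothetical trade-off weight."""
    return l_triplet + lam * l_align

q = [1.0, 0.0]
pos = [1.0, 0.0]                           # embedding of an evidence passage
easy = triplet_loss(q, pos, [0.0, 1.0])    # negative far from the query
hard = triplet_loss(q, pos, [1.0, 0.1])    # negative close to the query
```

The easy negative already satisfies the margin and contributes zero loss, whereas the hard negative sits almost on top of the positive and produces a loss close to the margin. That is the usual reason hard negatives carry more training signal.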

What carries the argument

The auxiliary attention alignment loss that supervises CLS-to-token attention with a masking distribution derived from human-annotated factual rationales.
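
A sketch of how such a target distribution and alignment loss might be built, under stated assumptions: the POS weights below are invented for illustration, and KL divergence is assumed as the alignment measure (the abstract names neither the weights nor the divergence).

```python
import math

# Hypothetical POS weights: content words count more than function words.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 1.0, "NUM": 1.0, "ADJ": 0.8}
DEFAULT_WEIGHT = 0.2  # assumed weight for all other tags

def target_distribution(pos_tags, rationale_mask):
    """Normalize POS weights over rationale tokens into a probability distribution."""
    raw = [POS_WEIGHTS.get(tag, DEFAULT_WEIGHT) * m
           for tag, m in zip(pos_tags, rationale_mask)]
    total = sum(raw)
    if total == 0:
        raise ValueError("rationale mask selects no tokens")
    return [w / total for w in raw]

def kl_divergence(target, attention, eps=1e-12):
    """KL(target || attention); tokens with zero target mass contribute nothing."""
    return sum(t * math.log(t / max(a, eps))
               for t, a in zip(target, attention) if t > 0)

# Tokens: "the drug reduced mortality significantly"
tags = ["DET", "NOUN", "VERB", "NOUN", "ADV"]
mask = [1, 1, 1, 1, 0]            # annotator marked the first four tokens
target = target_distribution(tags, mask)
uniform = [0.2] * 5               # attention spread evenly over all tokens
loss_uniform = kl_divergence(target, uniform)
loss_aligned = kl_divergence(target, target)
```

Here the determiner inside the rationale receives far less target mass than the nouns and the verb, and the loss falls to zero once the attention matches the target exactly.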

Load-bearing premise

That human-annotated factual rationales, when turned into a part-of-speech-weighted masking distribution, provide a reliable and generalizable supervision signal for forcing CLS-to-token attention to align with evidence.

What would settle it

If an ablation that removes the attention alignment loss shows no change on the clinical trial reports, either in attention overlap with the human rationales or in faithfulness metrics, the claimed benefit of the alignment component would be refuted.
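
One simple way to quantify "attention overlap with the human rationales" for such an ablation is a top-k hit rate, sketched below. The choice of metric, and of setting k to the number of rationale tokens, is an assumption; the source does not say which overlap measure the paper uses.

```python
def topk_overlap(attention, rationale_mask):
    """Share of the k highest-attention tokens that lie inside the rationale,
    where k is the number of rationale tokens."""
    k = sum(rationale_mask)
    if k == 0:
        return 0.0
    # Token indices sorted by attention weight, highest first.
    top = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)[:k]
    return sum(rationale_mask[i] for i in top) / k

mask = [0, 1, 1, 1, 0]
with_alignment = topk_overlap([0.05, 0.4, 0.3, 0.2, 0.05], mask)
without_alignment = topk_overlap([0.5, 0.1, 0.3, 0.05, 0.05], mask)
```

Comparing this score for models trained with and without the auxiliary loss gives the overlap half of the test. Overlap with human rationales measures agreement with annotators, so a separate faithfulness metric is still needed for the other half.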

Figures

Figures reproduced from arXiv: 2606.01482 by Ameeta Agrawal, André Freitas, Daniel Pedronette, Diego Alves, Francielle Vargas, João Robiatti, Lucas Pascotti Valem, Maximilian Seeth, Sebastián Ferrada.

Figure 1: Contrastive Evidence Rationale Attention
Figure 2: Hierarchical clustering based on lemmas (left) and heatmaps showing pairwise co-occurrence of lemmas
Figure 3: Prompt used for LLM-as-a-judge evaluation of retrieved spans against gold-standard references. The first ...
Original abstract

Ensuring factuality and interpretability in RAG remains an open and urgent problem. We introduce Contrastive Evidence Rationale Attention (CERA), the first retrieval framework to employ subjectivity-based hard negative selection and inject an evidential inductive bias into contrastive learning through an auxiliary attention alignment loss. CERA fine-tunes a dense retriever using two training objectives: triplet-based contrastive learning and interpretable attention alignment, which supervises CLS-to-token attention using a part-of-speech-weighted masking distribution over human-annotated factual rationales as evidence signals. Experiments on a large corpus of clinical trial reports demonstrate that the subjectivity-based hard negative selection substantially improves retrieval effectiveness compared to both Contriever and hard negative selection baselines. Furthermore, rationale alignment improves faithfulness while maintaining competitive retrieval performance, supporting the hypothesis that attention can serve as a more faithful explanation of model behavior when guided by human rationales. Moving beyond topical similarity, CERA enables the retriever to identify the specific tokens that constitute supporting evidence, promoting more interpretable evidence selection in RAG systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Contrastive Evidence Rationale Attention (CERA), a dense retriever fine-tuning method that augments standard contrastive (triplet) learning with subjectivity-based hard negative selection and an auxiliary attention-alignment loss. The auxiliary loss constructs a target distribution over tokens by applying part-of-speech weighting to human-annotated factual rationales and supervises the CLS-to-token attention of the retriever to align with this distribution. Experiments on a clinical-trial-report corpus are claimed to show that the hard-negative component improves retrieval effectiveness over Contriever and standard hard-negative baselines, while the rationale-alignment component improves faithfulness metrics without degrading retrieval performance.

Significance. If the reported gains are robust and the annotation-derived supervision generalizes, the work supplies a concrete mechanism for moving retrieval beyond topical similarity toward evidence-specific token selection, which could improve both effectiveness and interpretability in RAG pipelines. The explicit coupling of human rationales to attention alignment is a distinctive inductive bias that is not present in prior contrastive retrievers.

major comments (2)
  1. [Method (attention alignment loss)] The target distribution is derived from human-annotated rationales via POS-weighted masking. The manuscript provides neither inter-annotator agreement statistics nor an ablation that isolates the masking distribution from the contrastive objective, so the central claim that this produces more faithful explanations rests on an unverified premise about annotation quality and consistency.
  2. [Experiments] The abstract asserts that subjectivity-based hard negative selection 'substantially improves retrieval effectiveness' and that rationale alignment 'improves faithfulness,' yet the provided text supplies no quantitative metrics, dataset sizes, error bars, or statistical tests. Without these, the empirical support for the two central claims cannot be evaluated.
minor comments (1)
  1. [Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., nDCG@10 or faithfulness delta) rather than purely qualitative statements.
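
For reference, a minimal sketch of how nDCG@10, the metric suggested above, is typically computed for one query. It assumes the standard log2 rank discount and, as a simplification, derives the ideal ranking from the retrieved list alone, so relevant documents that were never retrieved are ignored.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (ranks start at 1)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the same relevances in ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

perfect = ndcg_at_k([1, 1, 0])   # relevant passages ranked first
mixed = ndcg_at_k([1, 0, 1])     # one relevant passage pushed to rank 3
```

A single averaged number of this kind, reported for CERA and for the Contriever baseline, would be enough to ground the abstract's qualitative claim.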

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments. We address each major comment point by point below and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Method (attention alignment loss)] The target distribution is derived from human-annotated rationales via POS-weighted masking. The manuscript provides neither inter-annotator agreement statistics nor an ablation that isolates the masking distribution from the contrastive objective, so the central claim that this produces more faithful explanations rests on an unverified premise about annotation quality and consistency.

    Authors: We acknowledge that inter-annotator agreement statistics are absent because the rationales were collected from a single annotator. We will add an ablation that removes the POS weighting from the target distribution while keeping the contrastive objective fixed, to isolate its effect on faithfulness metrics. These results will be added to the revised method and experiments sections.
    Revision: partial

  2. Referee: [Experiments] The abstract asserts that subjectivity-based hard negative selection 'substantially improves retrieval effectiveness' and that rationale alignment 'improves faithfulness,' yet the provided text supplies no quantitative metrics, dataset sizes, error bars, or statistical tests. Without these, the empirical support for the two central claims cannot be evaluated.

    Authors: The current manuscript text does not embed the quantitative results, dataset sizes, error bars, or statistical tests directly in the narrative. We will revise the experiments section to include these details explicitly (e.g., corpus size, nDCG/Recall values with standard deviations, and paired significance tests) so that the support for both claims is fully evaluable.
    Revision: yes
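
As an illustration of the kind of paired significance test mentioned in the response above, the sketch below runs a sign-flip permutation test on per-query score differences between two systems. The choice of this particular test, the number of resamples, and the toy scores are assumptions; the rebuttal does not specify which test would be used.

```python
import random

def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on per-query score differences.
    Returns an estimated p-value for the null hypothesis of no mean difference."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(n_resamples):
        # Under the null, each per-query difference is equally likely to flip sign.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (extreme + 1) / (n_resamples + 1)

# Toy per-query nDCG values for two hypothetical systems over 20 queries.
p_same = paired_permutation_test([0.5] * 5, [0.5] * 5, n_resamples=200)
p_better = paired_permutation_test([0.6] * 20, [0.5] * 20, n_resamples=2000)
```

Because the test pairs scores query by query, it needs per-query results for both systems on the same query set, which is a reason to report or release them alongside the averages.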

standing simulated objections not resolved
  • Inter-annotator agreement statistics for the rationale annotations (single-annotator collection process)

Circularity Check

0 steps flagged

No significant circularity; method uses external human annotations and standard objectives

Full rationale

The paper introduces CERA via triplet contrastive loss plus an auxiliary attention-alignment loss whose target is constructed from external human-annotated factual rationales (via POS-weighted masking). No equations, fitted parameters, or self-citations are shown to reduce any claimed prediction or faithfulness gain to a quantity defined inside the paper itself. The central claims rest on experimental comparison against Contriever and hard-negative baselines on an external clinical-trial corpus, not on any self-definitional or self-referential construction. This is the normal non-circular case for a supervised retrieval method whose supervision signal originates outside the model.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes that human rationales are high-quality evidence signals and that attention weights can be meaningfully supervised.

pith-pipeline@v0.9.1-grok · 5747 in / 1276 out tokens · 19422 ms · 2026-06-28T16:54:10.209357+00:00 · methodology

