Beyond Topical Similarity: Contrastive Evidence Retrieval with Interpretable Attention Alignment in RAG

Ameeta Agrawal; André Freitas; Daniel Pedronette; Diego Alves; Francielle Vargas; João Robiatti; Lucas Pascotti Valem; Maximilian Seeth; Sebastián Ferrada

arxiv: 2606.01482 · v1 · pith:U4LGYFAHnew · submitted 2026-05-31 · 💻 cs.CL

Pith reviewed 2026-06-28 16:54 UTC · model grok-4.3

classification 💻 cs.CL

keywords: RAG, contrastive learning, attention alignment, hard negative selection, interpretability, clinical trials, evidence retrieval

The pith

A contrastive retriever trained with human rationales aligns attention to factual evidence in RAG.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CERA, which trains dense retrievers with contrastive learning on subjectivity-based hard negatives plus an auxiliary loss that aligns the model's attention with human-annotated factual rationales through part-of-speech-weighted masking. The method is evaluated on a large corpus of clinical trial reports, where it is reported to improve retrieval effectiveness over baselines and to make attention more faithful as an explanation. The central goal is retrieval that identifies specific evidence tokens rather than mere topical similarity, which would make RAG systems more reliable and interpretable. This matters because current RAG systems often select text that is topically similar but non-evidential, which can lead to hallucinations.
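As a rough illustration of what subjectivity-based hard negative selection could look like, the sketch below keeps the candidates most similar to the query among those whose subjectivity score passes a threshold. The abstract does not describe the paper's actual selection procedure, so the function name `select_hard_negatives`, the precomputed `similarity` and `subjectivity` scores, the threshold, and the toy passages are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    text: str
    similarity: float    # topical similarity to the query (hypothetical, precomputed)
    subjectivity: float  # 0 = factual, 1 = subjective (hypothetical, precomputed)


def select_hard_negatives(candidates: List[Candidate],
                          k: int = 2,
                          min_subjectivity: float = 0.5) -> List[Candidate]:
    """Keep the k most query-similar candidates that look subjective.

    Illustrative assumption: a hard negative is topically close to the
    query but opinion-like rather than evidential.
    """
    subjective = [c for c in candidates if c.subjectivity >= min_subjectivity]
    return sorted(subjective, key=lambda c: c.similarity, reverse=True)[:k]


# Made-up toy pool; scores are invented for the example.
pool = [
    Candidate("Drug A reduced systolic pressure by 12 mmHg.", 0.91, 0.05),
    Candidate("We believe Drug A is a promising option.", 0.88, 0.80),
    Candidate("Drug A may arguably reshape hypertension care.", 0.84, 0.70),
    Candidate("The weather during enrollment was mild.", 0.10, 0.60),
]
negatives = select_hard_negatives(pool, k=2)
print([c.text for c in negatives])
```

The factual first passage is excluded despite having the highest similarity, which is the behavior the "topically similar but non-evidential" framing suggests; how subjectivity is actually scored is left open here.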

Core claim

CERA fine-tunes a dense retriever using two training objectives: triplet-based contrastive learning and interpretable attention alignment, which supervises CLS-to-token attention using a part-of-speech-weighted masking distribution over human-annotated factual rationales as evidence signals. Experiments demonstrate that the subjectivity-based hard negative selection substantially improves retrieval effectiveness compared to both Contriever and hard negative selection baselines, and rationale alignment improves faithfulness while maintaining competitive retrieval performance.
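The two objectives can be sketched as a single scalar loss. This is a minimal sketch under stated assumptions, not the authors' implementation: dot-product similarity, a hinge-style triplet loss, KL divergence as the alignment term, and the weight `lam` and `margin` values are all choices made here for illustration, and the vectors are toy numbers.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def triplet_loss(query, positive, negative, margin: float = 0.2) -> float:
    """Hinge loss: the positive should outscore the negative by `margin`."""
    return max(0.0, margin - dot(query, positive) + dot(query, negative))


def kl_divergence(target, attention, eps: float = 1e-12) -> float:
    """KL(target || attention) over tokens; both inputs should sum to 1."""
    return sum(t * math.log(t / max(a, eps))
               for t, a in zip(target, attention) if t > 0)


def combined_loss(query, positive, negative, cls_attention, target,
                  lam: float = 0.5, margin: float = 0.2) -> float:
    """Triplet loss plus a weighted attention-alignment penalty (assumed form)."""
    return (triplet_loss(query, positive, negative, margin)
            + lam * kl_divergence(target, cls_attention))


# Toy embeddings and distributions, invented for the example.
query, positive, negative = [1.0, 0.0], [0.9, 0.1], [0.8, 0.2]
target = [0.5, 0.375, 0.125, 0.0]          # rationale-derived target over 4 tokens
cls_attention = [0.25, 0.25, 0.25, 0.25]   # uniform CLS-to-token attention

print(triplet_loss(query, positive, negative))
print(combined_loss(query, positive, negative, cls_attention, target))
```

When the CLS attention already matches the target, the alignment term vanishes and only the contrastive term remains, which is one way to read "maintaining competitive retrieval performance".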

What carries the argument

The auxiliary attention alignment loss that supervises CLS-to-token attention with a masking distribution derived from human-annotated factual rationales.

Load-bearing premise

That human-annotated factual rationales, when turned into a part-of-speech-weighted masking distribution, provide a reliable and generalizable supervision signal for forcing CLS-to-token attention to align with evidence.
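To make the premise concrete, here is one way a part-of-speech-weighted masking distribution might be built from a binary rationale mask. The weights per tag, the `default_weight`, and the uniform fallback for passages with no rationale tokens are assumptions for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def rationale_target(pos_tags: List[str],
                     rationale_mask: List[int],
                     pos_weights: Dict[str, float],
                     default_weight: float = 0.5) -> List[float]:
    """Turn a 0/1 rationale mask into a distribution over tokens.

    Each rationale token gets the weight of its POS tag; non-rationale
    tokens get zero. The result is normalized to sum to 1.
    """
    raw = [pos_weights.get(tag, default_weight) * m
           for tag, m in zip(pos_tags, rationale_mask)]
    total = sum(raw)
    if total == 0:
        # Assumed fallback: no annotated evidence, so use a uniform target.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]


# Toy sentence: "aspirin reduced the risk", with the first three tokens annotated.
tags = ["NOUN", "VERB", "DET", "NOUN"]
mask = [1, 1, 1, 0]
weights = {"NOUN": 1.0, "VERB": 0.75, "DET": 0.25}  # hypothetical weights

target = rationale_target(tags, mask, weights)
print(target)  # [0.5, 0.375, 0.125, 0.0]
```

The sketch also shows where the premise can fail: if the mask marks the wrong tokens, the normalization faithfully concentrates supervision on them.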

What would settle it

If an ablation that removes the attention alignment loss shows no change, on the clinical trial reports, in either attention overlap with the human rationales or the faithfulness metrics, the claimed value of the alignment component would be refuted.
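A minimal sketch of how attention overlap with rationales could be measured for such an ablation follows. Both measures (attention mass on rationale tokens and top-k overlap) are common choices assumed here, not metrics confirmed by the paper, and the attention vectors are invented.

```python
from typing import List


def rationale_attention_mass(attention: List[float],
                             rationale_mask: List[int]) -> float:
    """Fraction of total attention that falls on rationale tokens."""
    total = sum(attention)
    if total == 0:
        return 0.0
    return sum(a for a, m in zip(attention, rationale_mask) if m) / total


def top_k_overlap(attention: List[float],
                  rationale_mask: List[int],
                  k: int) -> float:
    """Share of the k most-attended tokens that are rationale tokens."""
    top = sorted(range(len(attention)),
                 key=lambda i: attention[i], reverse=True)[:k]
    return sum(1 for i in top if rationale_mask[i]) / k


mask = [1, 1, 0, 0]                       # first two tokens are the rationale
with_alignment = [0.5, 0.3, 0.1, 0.1]     # toy attention, alignment loss on
without_alignment = [0.2, 0.2, 0.3, 0.3]  # toy attention, alignment loss off

for name, att in [("with", with_alignment), ("without", without_alignment)]:
    print(name, rationale_attention_mass(att, mask), top_k_overlap(att, mask, 2))
```

If the two configurations produced indistinguishable values across the evaluation set, that would be the null result described above.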

Figures

Figures reproduced from arXiv: 2606.01482 by Ameeta Agrawal, André Freitas, Daniel Pedronette, Diego Alves, Francielle Vargas, João Robiatti, Lucas Pascotti Valem, Maximilian Seeth, Sebastián Ferrada.

**Figure 2.** Hierarchical clustering based on lemmas (left) and heatmaps showing pairwise co-occurrence of lemmas

**Figure 3.** Prompt used for LLM-as-a-judge evaluation of retrieved spans against gold-standard references. The first ...

Original abstract

Ensuring factuality and interpretability in RAG remains an open and urgent problem. We introduce Contrastive Evidence Rationale Attention (CERA), the first retrieval framework to employ subjectivity-based hard negative selection and inject an evidential inductive bias into contrastive learning through an auxiliary attention alignment loss. CERA fine-tunes a dense retriever using two training objectives: triplet-based contrastive learning and interpretable attention alignment, which supervises CLS-to-token attention using a part-of-speech-weighted masking distribution over human-annotated factual rationales as evidence signals. Experiments on a large corpus of clinical trial reports demonstrate that the subjectivity-based hard negative selection substantially improves retrieval effectiveness compared to both Contriever and hard negative selection baselines. Furthermore, rationale alignment improves faithfulness while maintaining competitive retrieval performance, supporting the hypothesis that attention can serve as a more faithful explanation of model behavior when guided by human rationales. Moving beyond topical similarity, CERA enables the retriever to identify the specific tokens that constitute supporting evidence, promoting more interpretable evidence selection in RAG systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

CERA pairs subjectivity-based hard negatives with an auxiliary loss that aligns CLS attention to human rationales, but the abstract supplies no numbers or ablations so the gains cannot be checked yet.

The new piece is the specific mix of subjectivity-driven negative selection and an auxiliary attention-alignment term that uses POS-weighted masks from human factual rationales. That combination is not in the Contriever or standard hard-negative baselines they cite. The paper shows a clear attempt to move the retriever past topical overlap toward token-level evidence in a clinical-trial corpus, which is a reasonable direction for factuality work.

What stands out is the effort to supervise attention directly with external evidence signals rather than hoping contrastive loss alone produces faithful explanations. The method description is straightforward and the hypothesis is stated plainly.

The soft spot is the complete absence of numbers, dataset sizes, error bars, or ablation results in the abstract. Without those, it is impossible to judge whether the claimed improvements in retrieval effectiveness and faithfulness are real, large, or statistically supported. The stress-test concern about the reliability of the human rationales and the POS masking also lands: if the annotations are noisy, domain-specific, or do not actually mark the tokens a faithful model should attend to, the auxiliary loss could be fitting annotation artifacts rather than producing better explanations. No inter-annotator stats or masking ablations are referenced.

This is for readers already working on interpretable dense retrieval and RAG faithfulness, especially in high-stakes domains. A serious referee should see the full results section and any checks on annotation quality before deciding. I would send it to review rather than desk-reject, but the verdict stays open until the empirical details are visible.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Contrastive Evidence Rationale Attention (CERA), a dense retriever fine-tuning method that augments standard contrastive (triplet) learning with subjectivity-based hard negative selection and an auxiliary attention-alignment loss. The auxiliary loss constructs a target distribution over tokens by applying part-of-speech weighting to human-annotated factual rationales and supervises the CLS-to-token attention of the retriever to align with this distribution. Experiments on a clinical-trial-report corpus are claimed to show that the hard-negative component improves retrieval effectiveness over Contriever and standard hard-negative baselines, while the rationale-alignment component improves faithfulness metrics without degrading retrieval performance.

Significance. If the reported gains are robust and the annotation-derived supervision generalizes, the work supplies a concrete mechanism for moving retrieval beyond topical similarity toward evidence-specific token selection, which could improve both effectiveness and interpretability in RAG pipelines. The explicit coupling of human rationales to attention alignment is a distinctive inductive bias that is not present in prior contrastive retrievers.

major comments (2)

[Method (attention alignment loss)] Method section (attention alignment loss): the target distribution is derived from human-annotated rationales via POS-weighted masking; the manuscript provides neither inter-annotator agreement statistics nor an ablation that isolates the masking distribution from the contrastive objective, leaving the central claim that this produces more faithful explanations dependent on an unverified premise about annotation quality and consistency.
[Experiments] Experiments section: the abstract asserts that subjectivity-based hard negative selection 'substantially improves retrieval effectiveness' and that rationale alignment 'improves faithfulness,' yet the provided text supplies no quantitative metrics, dataset sizes, error bars, or statistical tests; without these the empirical support for the two central claims cannot be evaluated.

minor comments (1)

[Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., nDCG@10 or faithfulness delta) rather than purely qualitative statements.
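For reference, nDCG@10 can be computed as in the sketch below. It uses the standard linear-gain formulation with a log2 position discount and computes the ideal ranking from the same list of judged relevances; whether the paper uses this exact variant is not stated, so treat it as an assumption.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int) -> float:
    """Discounted cumulative gain with linear gains and log2 discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances: List[float], k: int = 10) -> float:
    """nDCG@k, where `relevances` are judged gains in ranked order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant item was judged
    return dcg_at_k(relevances, k) / ideal


ranked = [1, 0, 1]  # toy ranking: relevant, non-relevant, relevant
print(ndcg_at_k(ranked, k=10))
```

Reporting this value per system, together with its spread across queries, would address the request for at least one quantitative result.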

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments. We address each major comment point by point below and indicate where revisions will be made.

Referee: [Method (attention alignment loss)] Method section (attention alignment loss): the target distribution is derived from human-annotated rationales via POS-weighted masking; the manuscript provides neither inter-annotator agreement statistics nor an ablation that isolates the masking distribution from the contrastive objective, leaving the central claim that this produces more faithful explanations dependent on an unverified premise about annotation quality and consistency.

Authors: We acknowledge that inter-annotator agreement statistics are absent because the rationales were collected from a single annotator. We will add an ablation that removes the POS weighting from the target distribution while keeping the contrastive objective fixed, to isolate its effect on faithfulness metrics. These results will be added to the revised method and experiments sections.

revision: partial

Referee: [Experiments] Experiments section: the abstract asserts that subjectivity-based hard negative selection 'substantially improves retrieval effectiveness' and that rationale alignment 'improves faithfulness,' yet the provided text supplies no quantitative metrics, dataset sizes, error bars, or statistical tests; without these the empirical support for the two central claims cannot be evaluated.

Authors: The current manuscript text does not embed the quantitative results, dataset sizes, error bars, or statistical tests directly in the narrative. We will revise the experiments section to include these details explicitly (e.g., corpus size, nDCG/Recall values with standard deviations, and paired significance tests) so that the support for both claims is fully evaluable.

revision: yes
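One form a paired significance test could take is an exact sign-flip (paired randomization) test over per-query score differences, sketched below. The choice of this particular test, the function name, and the per-query scores are assumptions for illustration; exact enumeration is only practical for a small number of queries.

```python
from itertools import product
from typing import List


def paired_sign_flip_test(scores_a: List[float], scores_b: List[float]) -> float:
    """Two-sided exact p-value for the mean paired difference.

    Under the null hypothesis each per-query difference is equally likely
    to have either sign, so all 2**n sign assignments are enumerated.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Invented per-query nDCG values for two hypothetical systems.
system_a = [0.6, 0.7, 0.55, 0.65, 0.6]
system_b = [0.5, 0.5, 0.5, 0.5, 0.5]
print(paired_sign_flip_test(system_a, system_b))  # 0.0625
```

With only five queries the smallest attainable two-sided p-value is 2/32, which illustrates why the corpus and query-set sizes requested by the referee matter for interpreting any reported significance.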

standing simulated objections not resolved

Inter-annotator agreement statistics for the rationale annotations (single-annotator collection process)

Circularity Check

0 steps flagged

No significant circularity; method uses external human annotations and standard objectives

The paper introduces CERA via triplet contrastive loss plus an auxiliary attention-alignment loss whose target is constructed from external human-annotated factual rationales (via POS-weighted masking). No equations, fitted parameters, or self-citations are shown to reduce any claimed prediction or faithfulness gain to a quantity defined inside the paper itself. The central claims rest on experimental comparison against Contriever and hard-negative baselines on an external clinical-trial corpus, not on any self-definitional or self-referential construction. This is the normal non-circular case for a supervised retrieval method whose supervision signal originates outside the model.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes that human rationales are high-quality evidence signals and that attention weights can be meaningfully supervised.


Fan, Angela and Gardent, Claire. Generating Biographies on W ikipedia: The Impact of Gender Bias on the Retrieval-Based Generation of Women Biographies. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022. doi:10.18653/v1/2022.acl-long.586

work page doi:10.18653/v1/2022.acl-long.586 2022
