In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7052–7063, Online and Punta Cana, Dominican Republic
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG
FIDES detects token-level retrieval-memory conflicts via output, hidden, and trajectory signals to selectively apply contrastive decoding, raising context fidelity by 3-13 points over baselines across 18 settings on models up to 70B.
- Priors Persist Through Suppression: A Stroop Paradigm for Lexical Override
Pretrained lexical priors in language models persist despite explicit remapping rules, as shown by a Stroop paradigm where prior strength predicts interference and activation patching localizes the repair mechanism.