Contradictions between highly similar medical abstracts degrade the factual accuracy and consistency of LLM responses in retrieval-augmented generation.
arXiv preprint arXiv:2503.17933 (2025)
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
Headache specialists preferred their own literature summaries over those from Sonnet, GPT-4o, and Llama 3.1 in a blinded evaluation, though AI summaries were sometimes indistinguishable.
citing papers explorer
-
Contradictions in Context: Challenges for Retrieval-Augmented Generation in Healthcare
Contradictions between highly similar medical abstracts degrade the factual accuracy and consistency of LLM responses in retrieval-augmented generation.