Assessing and mitigating medical knowledge drift and conflicts in large language models

Wu, W · 2025 · arXiv 2505.07968

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Large Language Models Lack Temporal Awareness of Medical Knowledge

cs.LG · 2026-05-13 · unverdicted · novelty 8.0

LLMs lack temporal awareness of medical knowledge, showing gradual performance decline on up-to-date facts, much lower accuracy on historical knowledge (25-54% relative), and inconsistent year-to-year predictions.

RAGe: A Retrieval-Augmented Generation Evaluation Framework

cs.IR · 2026-05-23 · unverdicted · novelty 3.0

RAGe is a modular evaluation framework that correlates retrieval and generation quality with hardware constraints to recommend optimal RAG components for specific datasets.

citing papers explorer

Showing 2 of 2 citing papers.

Large Language Models Lack Temporal Awareness of Medical Knowledge
cs.LG · 2026-05-13 · unverdicted · none · ref 39
RAGe: A Retrieval-Augmented Generation Evaluation Framework
cs.IR · 2026-05-23 · unverdicted · none · ref 11