Assessing and mitigating medical knowledge drift and conflicts in large language models
2 Pith papers cite this work.
Pith papers citing it: 2
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
RAGe is a modular evaluation framework that correlates retrieval and generation quality with hardware constraints to recommend optimal RAG components for specific datasets.
citing papers explorer
- Large Language Models Lack Temporal Awareness of Medical Knowledge
LLMs lack temporal awareness of medical knowledge, showing gradual performance decline on up-to-date facts, much lower accuracy on historical knowledge (25-54% relative), and inconsistent year-to-year predictions.
- RAGe: A Retrieval-Augmented Generation Evaluation Framework
RAGe is a modular evaluation framework that correlates retrieval and generation quality with hardware constraints to recommend optimal RAG components for specific datasets.