Norman, Mariska M
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio
  LLM agents reach 90.9% retrieval recall at K=200 but recover at most 52.7% of ground-truth included studies because they cannot reliably apply PI/ECO eligibility criteria to topically similar distractors.
- Decision-Theoretic Stopping Rules for Document Screening
  Derives three EVPI-based stopping policies for document screening and shows higher net utility than recall-target methods on CLEF-IP and medical review datasets.