Introduces the NCU metric, which is based on token log-probabilities, and finds that small language models match or outperform larger ones in strict factual RAG extraction, while commercial APIs show high prior dominance and negative transfer.
Revisiting RAG retrievers: An information theoretic benchmark. arXiv preprint arXiv:2602.21553, 2026
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
How Many Tools Should an LLM Agent See? A Chance-Corrected Answer
Bits-over-Random (BoR) is a chance-corrected metric for tool shortlist evaluation that enables query-adaptive depth selection via RL, matching fixed-list coverage with shorter lists on BFCL and ToolBench.