Partial Evidence Bench is a deterministic benchmark that measures agent correctness, completeness awareness, gap-report quality, and unsafe overclaiming in authorization-constrained evidence environments.
Do llms know when to not answer? investigating abstention abilities of large language models.arXiv preprint arXiv:2407.16221
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
LLM agents overcommit on non-complete tasks at 41.7% unless given explicit support-state categories, which raise typed deferral accuracy to 91.7%.
BAS aggregates utility from an answer-or-abstain model across risk thresholds and is uniquely maximized by truthful confidence estimates.
Language models deploy multidimensional internal confidence representations and threshold-based policies to control abstention behavior, with causal support from activation steering experiments.
citing papers explorer
-
Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems
Partial Evidence Bench is a deterministic benchmark that measures agent correctness, completeness awareness, gap-report quality, and unsafe overclaiming in authorization-constrained evidence environments.
-
Don't Start What You Can't Finish: A Counterfactual Audit of Support-State Triage in LLM Agents
LLM agents overcommit on non-complete tasks at 41.7% unless given explicit support-state categories, which raise typed deferral accuracy to 91.7%.
-
BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence
BAS aggregates utility from an answer-or-abstain model across risk thresholds and is uniquely maximized by truthful confidence estimates.
-
Causal Evidence that Language Models use Confidence to Drive Behavior
Language models deploy multidimensional internal confidence representations and threshold-based policies to control abstention behavior, with causal support from activation steering experiments.