Do LLMs know when to not answer? Investigating abstention abilities of large language models. arXiv preprint arXiv:2407.16221

Nishanth Madhusudhan, Sathwik Tejaswi Madhusudhan, Vikas Yadav, Masoud Hashemi · 2024 · arXiv 2407.16221

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems

cs.AI · 2026-05-06 · unverdicted · novelty 7.0

Partial Evidence Bench is a deterministic benchmark that measures agent correctness, completeness awareness, gap-report quality, and unsafe overclaiming in authorization-constrained evidence environments.

Don't Start What You Can't Finish: A Counterfactual Audit of Support-State Triage in LLM Agents

cs.AI · 2026-04-17 · unverdicted · novelty 7.0

LLM agents overcommit on non-complete tasks at 41.7% unless given explicit support-state categories, which raise typed deferral accuracy to 91.7%.

BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence

cs.CL · 2026-04-03 · unverdicted · novelty 7.0

BAS aggregates utility from an answer-or-abstain model across risk thresholds and is uniquely maximized by truthful confidence estimates.

Causal Evidence that Language Models use Confidence to Drive Behavior

cs.LG · 2026-03-23 · unverdicted · novelty 6.0

Language models deploy multidimensional internal confidence representations and threshold-based policies to control abstention behavior, with causal support from activation steering experiments.

citing papers explorer

Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems cs.AI · 2026-05-06 · unverdicted · none · ref 11
Don't Start What You Can't Finish: A Counterfactual Audit of Support-State Triage in LLM Agents cs.AI · 2026-04-17 · unverdicted · none · ref 3
BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence cs.CL · 2026-04-03 · unverdicted · none · ref 35
Causal Evidence that Language Models use Confidence to Drive Behavior cs.LG · 2026-03-23 · unverdicted · none · ref 11
