Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation

· 2026 · cs.CL · arXiv 2603.05881

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Reliable deployment of large language models (LLMs) requires accurate uncertainty estimation. Existing methods are predominantly answer-first: they produce a confidence score only after generating an answer, so the score measures the correctness of one specific response, which limits practical usability. We study a confidence-first paradigm, in which the model outputs its confidence before answering, and we interpret this score as the model's probability of answering the question correctly under its current policy. We propose CoCA (Co-optimized Confidence and Answers), a GRPO reinforcement learning framework that jointly optimizes confidence calibration and answer accuracy via segmented credit assignment. By assigning separate rewards and group-relative advantages to the confidence and answer segments, CoCA enables stable joint optimization and avoids reward hacking. Experiments across math, code, and factual QA benchmarks show improved calibration and uncertainty discrimination while preserving answer quality, thereby enabling a broader range of downstream applications.
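The segmented credit assignment described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the reward functions, so the sketch assumes a 0/1 correctness reward for the answer segment and a negative squared error between the stated confidence and the group's empirical accuracy for the confidence segment. The function names (`group_relative`, `segmented_advantages`) are hypothetical, and the standard-deviation normalization follows the common GRPO convention rather than anything stated here.

```python
from statistics import mean, pstdev


def group_relative(rewards, eps=1e-6):
    """Normalize rewards within one sampled group (GRPO-style advantage)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def segmented_advantages(confidences, correct):
    """Return separate advantages for the confidence and answer segments.

    confidences: stated confidence in [0, 1] for each sampled response
    correct:     1 if that response's answer was correct, else 0

    ASSUMPTION: the calibration target is the group's empirical accuracy,
    used as a proxy for the probability of answering correctly under the
    current policy. The paper's actual reward may differ.
    """
    p_hat = mean(correct)
    # Answer segment: rewarded only for correctness.
    answer_rewards = [float(c) for c in correct]
    # Confidence segment: rewarded only for closeness to the target.
    conf_rewards = [-(q - p_hat) ** 2 for q in confidences]
    # Each segment gets its own group-relative advantage, so a confidence
    # token is never credited or blamed for the answer's reward.
    return group_relative(conf_rewards), group_relative(answer_rewards)


# Four sampled responses to one prompt; two are correct, so p_hat = 0.5.
conf_adv, ans_adv = segmented_advantages([0.9, 0.5, 0.2, 0.6], [1, 0, 0, 1])
```

In this toy group, the response that stated a confidence of 0.5 receives the largest confidence advantage even though its answer was wrong, while its answer tokens receive a negative advantage. Keeping the two signals apart in this way is one plausible reading of how separate rewards discourage the model from gaming a single combined reward.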

representative citing papers

CALIBER: Calibrating Confidence Before and After Reasoning in Language Models

cs.CL · 2026-06-23 · unverdicted · novelty 6.0

CALIBER elicits and supervises pre-reasoning confidence with prompt-level success probability and post-reasoning confidence with answer-level correctness, cutting ECE by 52.5% on BigMathDigits for a 7B model while remaining competitive on accuracy.
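For readers unfamiliar with the metric quoted above, expected calibration error (ECE) can be sketched as follows. This is the standard equal-width binned estimator, not necessarily the exact variant used by CALIBER; the bin count of 10 and the function name are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for q, c in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        idx = min(int(q * n_bins), n_bins - 1)
        bins[idx].append((q, c))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(q for q, _ in members) / len(members)
        accuracy = sum(c for _, c in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece
```

A perfectly calibrated set of predictions yields an ECE of 0, and a model that is always fully confident and always wrong yields 1.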

citing papers explorer

CALIBER: Calibrating Confidence Before and After Reasoning in Language Models

cs.CL · 2026-06-23 · unverdicted · none · ref 13 · internal anchor