CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency

Ahmad Ghasemi; Ehsan Aghazadeh; Hedyeh Beyhaghi; Hossein Pishro-Nik

arxiv: 2511.02603 · v2 · pith:EULKGU3Jnew · submitted 2025-11-04 · 💻 cs.CL

classification 💻 cs.CL

keywords cges · self-consistency · answer · average · calls · confidence-guided · early · number

Large language models (LLMs) are often queried multiple times at test time, with predictions aggregated by majority vote. While effective, this self-consistency strategy (Wang et al., 2023) requires a fixed number of calls and can fail when the correct answer is sampled infrequently. We introduce Confidence-Guided Early Stopping (CGES), a Bayesian framework that forms posteriors over candidate answers and adaptively halts sampling once one answer accumulates enough posterior mass. We prove guarantees in both an ideal calibrated regime and a realistic noisy-confidence regime under a directional drift condition. Across five reasoning benchmarks, CGES reduces the average number of calls by 58% (from 16.0 to 6.7) while staying within 0.4 percentage points of the accuracy of self-consistency.
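The stopping rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the likelihood model, so the sketch assumes each sampled answer is correct with probability equal to its reported confidence and that errors are spread uniformly over a fixed, hypothetical number of candidate answers (`num_candidates`). The names `cges_sketch`, `sample_fn`, `threshold`, and `max_calls` are likewise illustrative.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, Tuple


def cges_sketch(
    sample_fn: Callable[[], Tuple[str, float]],
    max_calls: int = 16,
    threshold: float = 0.95,
    num_candidates: int = 10,
    eps: float = 1e-6,
) -> Tuple[str, int, Dict[str, float]]:
    """Sample until one answer's posterior mass reaches `threshold`.

    ASSUMPTION (not from the paper): a sample with confidence c names the
    true answer with probability c; otherwise it names one of the other
    `num_candidates - 1` answers uniformly at random.
    """
    if num_candidates < 2 or max_calls < 1:
        raise ValueError("need num_candidates >= 2 and max_calls >= 1")

    # Log-likelihood of each seen answer relative to an answer never sampled.
    log_weight: Dict[str, float] = defaultdict(float)
    posterior: Dict[str, float] = {}
    best = ""

    for calls in range(1, max_calls + 1):
        answer, conf = sample_fn()
        conf = min(max(conf, eps), 1.0 - eps)  # keep the logs finite
        miss = (1.0 - conf) / (num_candidates - 1)
        log_weight[answer] += math.log(conf) - math.log(miss)

        # Unseen candidates each carry relative weight exp(0) = 1.
        unseen = max(num_candidates - len(log_weight), 0)
        shift = max(max(log_weight.values()), 0.0)  # numerical stability
        norm = sum(math.exp(w - shift) for w in log_weight.values())
        norm += unseen * math.exp(-shift)
        posterior = {a: math.exp(w - shift) / norm for a, w in log_weight.items()}

        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            break  # enough posterior mass: stop early

    return best, calls, posterior


# Toy stream of (answer, confidence) pairs standing in for LLM calls.
stream = iter([("42", 0.9), ("42", 0.9), ("7", 0.5)])
answer, calls, posterior = cges_sketch(lambda: next(stream))
print(answer, calls, round(posterior[answer], 4))  # prints: 42 2 0.9986
```

Under these assumptions a single sample with confidence 0.9 yields a posterior of exactly 0.9, which is below the 0.95 threshold, so a second agreeing sample is needed before stopping. If no answer reaches the threshold, the sketch returns the highest-posterior answer after `max_calls` samples.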

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency
stat.ML · 2026-05 · unverdicted · novelty 7.0

CITE certifies that a prespecified answer is the unique mode of an LLM response distribution with anytime-valid error control under arbitrary data-driven stopping and without prior knowledge of the answer set.