Logprobs explained only 4.9% of variance in within-run verbal confidence (r = 0.23, R²_CV = 0.049) and 8.4% in cross-run verbal confidence (r = 0.29, R²_CV = 0.084)

Phase 1 (different run with identical questions but answers provided in the prompt) · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

cs.CL · 2026-03-18 · unverdicted · novelty 6.0

Mechanistic experiments on Gemma 3 27B, Qwen 2.5 7B and Magistral Small 24B show verbal confidence is cached at post-answer positions from answer tokens and captures richer answer-quality information beyond token log-probabilities.

citing papers explorer

Showing 1 of 1 citing paper.

How do LLMs Compute Verbal Confidence · cs.CL · 2026-03-18 · unverdicted · none · ref 29
