pith. sign in

hub

A Survey of Con- fidence Estimation and Calibration in Large Language Models

12 Pith papers cite this work. Polarity classification is still indexing.

12 Pith papers citing it

hub tools

years

2026 10 2025 2

representative citing papers

Code Is More Than Text: Uncertainty Estimation for Code Generation

cs.CL · 2026-06-08 · unverdicted · novelty 6.0

Three code-specific uncertainty axes (lexical, algorithmic, functional) yield an ensemble that raises average AUROC from 0.696 to 0.776 across five code LLMs, with one single-pass signal matching multi-pass baselines at lower cost.

Quantifying Faithful Confidence Expression in Large Reasoning Models

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

A new framework quantifies faithful confidence expression in large reasoning models by comparing linguistic decisiveness to token probabilities, hidden states, and response consistency, revealing it as a persistent challenge.

Efficient Test-Time Scaling via Temporal Reasoning Aggregation

cs.AI · 2026-04-19 · unverdicted · novelty 5.0

TRACE aggregates answer consistency and confidence trajectory over multiple reasoning steps to decide when to halt inference, reducing token usage by 25-30% while keeping accuracy within 1-2% of full reasoning.

citing papers explorer

Showing 12 of 12 citing papers.