A Survey of Confidence Estimation and Calibration in Large Language Models

Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych · 2024 · DOI 10.18653/v1/2024.naacl-long.366

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

When Embedding-Based Defenses Fail: Rethinking Safety in LLM-Based Multi-Agent Systems

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

Embedding-based defenses fail against attacks that align malicious message embeddings with benign ones in LLM multi-agent systems, but token-level confidence scores improve robustness by enabling better pruning of suspicious messages.

Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models

cs.CL · 2025-02-20 · unverdicted · novelty 6.0

Adapts multi-layer token-level Mahalanobis distance with supervised linear regression to yield improved uncertainty scores for LLM truthfulness tasks.

An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress

cs.AI · 2026-04-27 · unverdicted · novelty 5.0

A thermodynamic-inspired information-geometric framework defines a composite LLM stability score that outperforms a utility-entropy baseline by 0.0299 on average across 80 observations, with gains increasing at higher entropy.

Efficient Test-Time Scaling via Temporal Reasoning Aggregation

cs.AI · 2026-04-19 · unverdicted · novelty 5.0

TRACE aggregates answer consistency and confidence trajectory over multiple reasoning steps to decide when to halt inference, reducing token usage by 25-30% while keeping accuracy within 1-2% of full reasoning.

Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment

cs.CY · 2026-03-29 · unverdicted · novelty 4.0

Verbalized confidence from small LMs enables cost-effective cascade routing for automated educational scoring, matching large-model accuracy at 76% lower cost when discrimination is strong.

Improving the Distributional Alignment of LLMs using Supervision

cs.CL · 2025-07-01 · unverdicted · novelty 4.0

Simple supervision improves LLM distributional alignment with diverse population groups on three datasets, with evaluation across multiple models and prompts providing a benchmark.

ECUAS$_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

cs.AI · 2026-05-19 · 2 refs


