SCA framework applies Information Bottleneck to assign step-level confidence in black-box LLM reasoning traces, flagging errors and boosting self-correction success by up to 13.5% on math and QA tasks.
The Eleventh International Conference on Learning Representations , year=
7 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 7representative citing papers
Proxy metrics from next-token distributions over expert solutions outperform loss and compute baselines for ranking LLMs, selecting pretraining data, and extrapolating performance across compute scales.
LaaB improves LLM hallucination detection by mapping self-judgment labels back into neural feature space and using mutual learning under logical consistency constraints between responses and meta-judgments.
Social identity markers in medical questions degrade LLM accuracy and uncertainty calibration, producing a calibration crisis that is non-additive for intersectional cases.
SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation tasks.
Mainstream UQ for LLMs reduces to unsupervised clustering of internal generation consistency and therefore cannot detect confident hallucinations or provide reliable safety signals.
A geometric 1-form on token embeddings has curvature that couples to semantic world models in language models, as evidenced by clustering on chess board regions and piece importance.
citing papers explorer
-
Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution
SCA framework applies Information Bottleneck to assign step-level confidence in black-box LLM reasoning traces, flagging errors and boosting self-correction success by up to 13.5% on math and QA tasks.
-
Forecasting Downstream Performance of LLMs With Proxy Metrics
Proxy metrics from next-token distributions over expert solutions outperform loss and compute baselines for ranking LLMs, selecting pretraining data, and extrapolating performance across compute scales.
-
Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments
LaaB improves LLM hallucination detection by mapping self-judgment labels back into neural feature space and using mutual learning under logical consistency constraints between responses and meta-judgments.
-
Calibrated? Not for Everyone: How Sexual Orientation and Religious Markers Distort LLM Accuracy and Confidence in Medical QA
Social identity markers in medical questions degrade LLM accuracy and uncertainty calibration, producing a calibration crisis that is non-additive for intersectional cases.
-
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation tasks.
-
Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering
Mainstream UQ for LLMs reduces to unsupervised clustering of internal generation consistency and therefore cannot detect confident hallucinations or provide reliable safety signals.
-
A geometric relation of the error introduced by sampling a language model's output distribution to its internal state
A geometric 1-form on token embeddings has curvature that couples to semantic world models in language models, as evidenced by clustering on chess board regions and piece importance.