On calibration of modern neural networks

Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q Weinberger · 2017

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Retrieval-Augmented Linguistic Calibration

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

Presents a distributional model of linguistic confidence, Faithfulness Divergence metric, and RALC pipeline that boosts faithfulness and calibration on QA benchmarks across LLM families.

LiBaGS: Lightweight Boundary Gap Synthesis for Targeted Synthetic Data Selection

cs.LG · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

LiBaGS scores and selects synthetic data near decision boundaries using proximity, uncertainty, density, and validity, with boundary-gap allocation and marginal stopping to improve training accuracy.

Auditing Multimodal LLM Raters: Central Tendency Bias in Clinical Ordinal Scoring

cs.CV · 2026-05-11 · conditional · novelty 6.0

Multimodal LLMs exhibit central tendency bias when scoring ordinal clinical images, over-predicting low scores and under-predicting high scores even after prompt ablations.

Distributional Process Reward Models: Calibrated Prediction of Future Rewards via Conditional Optimal Transport

cs.LG · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

Conditional optimal transport is used to turn raw PRM outputs into monotonic quantile functions that improve calibration and downstream Best-of-N performance on MATH-500 and AIME.

SATTC: Structure-Aware Label-Free Test-Time Calibration for Cross-Subject EEG-to-Image Retrieval

cs.CV · 2026-03-21 · conditional · novelty 6.0

SATTC improves top-k accuracy in cross-subject EEG-to-image retrieval by fusing geometric whitening and structural nearest-neighbor experts on the similarity matrix without labels.

GrACE: A Generative Approach to Better Confidence Elicitation and Efficient Test-Time Scaling in Large Language Models

cs.CL · 2025-09-11 · unverdicted · novelty 6.0

GrACE is a fine-tuned generative method that uses similarity to a special token embedding for real-time calibrated confidence in LLMs and enables efficient confidence-based test-time scaling.

On the explainability of max-plus neural networks

cs.CV · 2026-04-27 · unverdicted · novelty 5.0

Max-plus neural networks enable tracing each output to one dominant neuron, allowing a pixel fragility measure that provides more useful explanations than SHAP or Integrated Gradients on medical images.

Revisiting Neural Activation Coverage for Uncertainty Estimation

cs.LG · 2026-04-24 · unverdicted · novelty 5.0

Neural activation coverage can be adapted to provide uncertainty estimates in regression that the authors' experiments show are more meaningful than Monte-Carlo Dropout.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Retrieval-Augmented Linguistic Calibration

cs.CL · 2026-05-19 · unverdicted · none · ref 12

Presents a distributional model of linguistic confidence, Faithfulness Divergence metric, and RALC pipeline that boosts faithfulness and calibration on QA benchmarks across LLM families.

GrACE: A Generative Approach to Better Confidence Elicitation and Efficient Test-Time Scaling in Large Language Models

cs.CL · 2025-09-11 · unverdicted · none · ref 9

GrACE is a fine-tuned generative method that uses similarity to a special token embedding for real-time calibrated confidence in LLMs and enables efficient confidence-based test-time scaling.
