Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization

Chengli Tan; Guang Dai; Haishan Ye; Jiangshe Zhang; Junmin Liu; Yong Xu; Yubo Zhou; Yunda Hao; Zengjie Song; Zixiang Zhao

arxiv: 2505.23866 · v2 · pith:3WH35FDEnew · submitted 2025-05-29 · 💻 cs.LG · cs.AI



classification 💻 cs.LG cs.AI

keywords calibration, benefits, CSAM, error, minimization, overconfidence, sharpness-aware, towards


Deep neural networks are increasingly used in safety-critical applications such as medical diagnosis and autonomous driving. However, many studies suggest that they are prone to poor calibration and overconfidence, which may have disastrous consequences. In this paper, we show that, unlike standard training methods such as stochastic gradient descent, the recently proposed sharpness-aware minimization (SAM) counteracts this tendency towards overconfidence. Our theoretical analysis suggests that SAM learns models that are already well calibrated by implicitly maximizing the entropy of the predictive distribution. Inspired by this finding, we further propose a variant of SAM, named CSAM, to improve model calibration. Extensive experiments on various datasets, including ImageNet-1K, demonstrate the benefits of SAM in reducing calibration error. Moreover, CSAM performs even better than SAM and consistently achieves lower calibration error than other approaches.
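To make the two quantities in the abstract concrete, the sketch below computes a binned expected calibration error (ECE) and the entropy of the predictive distribution with NumPy. It is only an illustration under stated assumptions: the abstract does not say which calibration metric or binning the authors use, so the equal-width, 10-bin ECE here is a common convention rather than the paper's protocol, and the function names and toy probabilities are hypothetical. It does not implement SAM or CSAM.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    Assumption: equal-width confidence bins (lo, hi]; the paper may use
    a different metric or binning scheme.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)          # top-class probability
    predictions = probs.argmax(axis=1)       # predicted class
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap       # weight by fraction of samples
    return ece


def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    probs = np.asarray(probs, dtype=float)
    return -(probs * np.log(probs + eps)).sum(axis=1)


# Hypothetical toy data: both models make the same predictions and are
# right on 3 of 4 samples, but one reports 99% confidence and the other 75%.
labels = np.array([0, 0, 0, 1])
overconfident = np.array([[0.99, 0.01]] * 4)
softer = np.array([[0.75, 0.25]] * 4)

ece_over = expected_calibration_error(overconfident, labels)
ece_soft = expected_calibration_error(softer, labels)
entropy_over = predictive_entropy(overconfident).mean()
entropy_soft = predictive_entropy(softer).mean()
```

In this toy case both models have the same accuracy, yet the higher-entropy predictions have the smaller calibration gap, which is the kind of relationship between entropy and overconfidence that the abstract describes.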


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
cs.CV 2026-04 unverdicted novelty 6.0

A data-free pretraining step that places prompts in flatter loss regions improves calibration and performance when used as initialization for test-time prompt tuning of vision-language models.