pith. machine review for the scientific record.

arxiv: 1612.02295 · v4 · submitted 2016-12-07 · stat.ML · cs.LG

Recognition: unknown

Large-Margin Softmax Loss for Convolutional Neural Networks

Authors on Pith: no claims yet
classification: stat.ML · cs.LG
keywords: loss, l-softmax, features, softmax, convolutional, discriminative, explicitly, large-margin
read the original abstract

Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity, and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax can not only adjust the desired margin but also avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply learned features with the L-Softmax loss become more discriminative, hence significantly boosting performance on a variety of visual classification and verification tasks.
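The margin in L-Softmax enters by replacing the target-class logit |W_y||x|cos(θ) with |W_y||x|ψ(θ), where ψ(θ) = (-1)^k cos(mθ) - 2k for θ ∈ [kπ/m, (k+1)π/m], which forces the target angle to be smaller by a factor of m before the example is classified confidently. Below is a minimal PyTorch sketch of that idea, not the authors' released implementation; the class name LSoftmaxLinear, the weight initialization, and the omission of the paper's annealing schedule (which blends ψ with plain cos early in training) are illustrative choices.

```python
# Minimal sketch of L-Softmax: the target-class logit |W_y||x|cos(theta)
# is replaced by |W_y||x|psi(theta), psi(theta) = (-1)^k cos(m*theta) - 2k
# on theta in [k*pi/m, (k+1)*pi/m]. Illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSoftmaxLinear(nn.Module):
    def __init__(self, in_features, num_classes, margin=2):
        super().__init__()
        self.margin = margin
        self.weight = nn.Parameter(0.01 * torch.randn(num_classes, in_features))

    def forward(self, x, target=None):
        logits = x @ self.weight.t()              # |W_j||x_i| cos(theta_ij)
        if target is None:                        # inference: ordinary logits
            return logits
        idx = torch.arange(x.size(0))
        w_norm = self.weight.norm(dim=1)[target]  # |W_{y_i}|
        x_norm = x.norm(dim=1)                    # |x_i|
        cos = logits[idx, target] / (w_norm * x_norm + 1e-8)
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # k picks the monotonic piece of psi; it is piecewise constant,
        # so detaching it leaves the well-defined gradient unchanged.
        k = (self.margin * theta / torch.pi).floor().detach()
        sign = 1.0 - 2.0 * (k % 2)                # (-1)^k without pow
        psi = sign * torch.cos(self.margin * theta) - 2.0 * k
        out = logits.clone()
        out[idx, target] = w_norm * x_norm * psi  # margin-adjusted target logit
        return out

# The adjusted logits feed a standard cross-entropy loss, so the model
# trains with ordinary SGD, as the abstract notes.
layer = LSoftmaxLinear(in_features=64, num_classes=10, margin=3)
feats = torch.randn(8, 64, requires_grad=True)
labels = torch.randint(0, 10, (8,))
loss = F.cross_entropy(layer(feats, labels), labels)
loss.backward()
```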

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Silhouette Loss: Differentiable Global Structure Learning for Deep Representations

    cs.LG · 2026-03 · unverdicted · novelty 8.0

    Soft Silhouette Loss offers a batch-global differentiable metric to promote intra-class compactness and inter-class separation in learned representations, boosting performance when hybridized with cross-entropy and co...

  2. Quantum Interval Bound Propagation for Certified Training of Quantum Neural Networks

    quant-ph · 2026-05 · unverdicted · novelty 7.0

    QIBP adapts interval bound propagation to quantum neural networks for certified adversarial robustness via interval and affine arithmetic implementations.

  3. Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    ETN is a lightweight post-hoc module that applies a learned sample-dependent affine transformation to pretrained model logits and interprets the outputs as Dirichlet parameters to enable efficient uncertainty estimation.

  4. Distribution-Free Pretraining of Classification Losses via Evolutionary Dynamics

    cs.LG · 2026-05 · unverdicted · novelty 6.0

    EDL learns a transferable classification loss from unlimited synthetic data via evolutionary optimization and a ranking-consistency objective, serving as a competitive drop-in replacement for cross-entropy on CIFAR-10...