Cross-Entropy Loss and Low-Rank Features Have Responsibility for Adversarial Examples

Kamil Nar, Kannan Ramchandran, Orhan Ocal, S. Shankar Sastry

classification 💻 cs.LG stat.ML

keywords trainingadversarialdatasetfeatureslosscross-entropydifferentialexamples

read the original abstract

State-of-the-art neural networks are vulnerable to adversarial examples; they can easily misclassify inputs that are imperceptibly different than their training and test data. In this work, we establish that the use of cross-entropy loss function and the low-rank features of the training data have responsibility for the existence of these inputs. Based on this observation, we suggest that addressing adversarial examples requires rethinking the use of cross-entropy loss function and looking for an alternative that is more suited for minimization with low-rank features. In this direction, we present a training scheme called differential training, which uses a loss function defined on the differences between the features of points from opposite classes. We show that differential training can ensure a large margin between the decision boundary of the neural network and the points in the training dataset. This larger margin increases the amount of perturbation needed to flip the prediction of the classifier and makes it harder to find an adversarial example with small perturbations. We test differential training on a binary classification task with CIFAR-10 dataset and demonstrate that it radically reduces the ratio of images for which an adversarial example could be found -- not only in the training dataset, but in the test dataset as well.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Low Rank Adaptation for Adversarial Perturbation
cs.LG 2026-04 unverdicted novelty 7.0

Adversarial perturbations possess an inherently low-rank structure that enables more efficient and effective black-box adversarial attacks via subspace projection.