Lower Bounds on Adversarial Robustness from Optimal Transport

Arjun Nitin Bhagoji; Daniel Cullina; Prateek Mittal

arxiv: 1909.12272 · v2 · pith:BXCYOCZKnew · submitted 2019-09-26 · 💻 cs.LG · cs.CR· stat.ML

Lower Bounds on Adversarial Robustness from Optimal Transport

Arjun Nitin Bhagoji , Daniel Cullina , Prateek Mittal This is my paper

classification 💻 cs.LG cs.CRstat.ML

keywords optimalclassificationclassifiercostexampleminimumtransportadversarial

0 comments

read the original abstract

While progress has been made in understanding the robustness of machine learning classifiers to test-time adversaries (evasion attacks), fundamental questions remain unresolved. In this paper, we use optimal transport to characterize the minimum possible loss in an adversarial classification scenario. In this setting, an adversary receives a random labeled example from one of two classes, perturbs the example subject to a neighborhood constraint, and presents the modified example to the classifier. We define an appropriate cost function such that the minimum transportation cost between the distributions of the two classes determines the minimum $0-1$ loss for any classifier. When the classifier comes from a restricted hypothesis class, the optimal transportation cost provides a lower bound. We apply our framework to the case of Gaussian data with norm-bounded adversaries and explicitly show matching bounds for the classification and transport problems as well as the optimality of linear classifiers. We also characterize the sample complexity of learning in this setting, deriving and extending previously known results as a special case. Finally, we use our framework to study the gap between the optimal classification performance possible and that currently achieved by state-of-the-art robustly trained neural networks for datasets of interest, namely, MNIST, Fashion MNIST and CIFAR-10.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Homogenization of $\ell_2$-Adversarial Training in High-Dimensions: Exact Dynamics under Stochastic Gradient Descent
math.OC 2026-06 unverdicted novelty 7.0

Derives ODE deterministic equivalents and an adversarial homogenized SDE for SGD iterates in high-dim ℓ2-adversarial training, showing no constant learning rate ensures monotone descent for single-class adversarial le...