FastAT Benchmark: A Comprehensive Framework for Fair Evaluation of Fast Adversarial Training Methods
Pith reviewed 2026-05-10 00:19 UTC · model grok-4.3
The pith
A controlled benchmark shows well-designed single-step adversarial training can match PGD-AT robustness at far lower cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The FastAT Benchmark enforces unified architecture, standardized training settings, and strict prohibition of external data to evaluate over twenty representative FastAT methods on dual metrics of robustness and cost. Comprehensive tests on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions.
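To make the cost gap concrete: PGD-AT performs several forward/backward passes per batch to craft its adversarial examples, while single-step methods in the spirit of random-start FGSM [2] use only one. The sketch below is a minimal, generic illustration of that difference in PyTorch, not a re-implementation of any specific benchmarked method.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps, alpha):
    # Single-step: one gradient computation per batch (random start, then an FGSM step).
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).detach().clamp(0, 1)

def pgd_example(model, x, y, eps, alpha, steps=10):
    # Multi-step: `steps` gradient computations per batch, hence the higher cost.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach().clamp(0, 1)
```

During adversarial training the returned examples replace (or augment) the clean batch before the usual optimizer step, so the per-batch attack cost is the main driver of the training-time difference the benchmark measures.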
What carries the argument
The FastAT Benchmark, a controlled evaluation framework that standardizes model architecture, training hyperparameters, and data usage to produce directly comparable robustness and efficiency measurements for FastAT methods.
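One way to picture the dual-metric protocol: run every method's training routine on the same fixed architecture and schedule, and log robust accuracy alongside wall-clock training time and peak GPU memory. The harness below is a minimal sketch under that assumption; `model_fn`, `train_fn`, and `robust_eval_fn` are hypothetical stand-ins, not the benchmark's actual API.

```python
import time
import torch

def run_method(name, train_fn, model_fn, loaders, robust_eval_fn, device="cuda"):
    # Same architecture and data for every method: only `train_fn` differs.
    model = model_fn().to(device)
    torch.cuda.reset_peak_memory_stats(device)
    start = time.perf_counter()
    train_fn(model, loaders["train"])                    # method-specific FastAT loop
    train_time_s = time.perf_counter() - start
    peak_mem_gb = torch.cuda.max_memory_allocated(device) / 1024 ** 3
    robust_acc = robust_eval_fn(model, loaders["test"])  # e.g., accuracy under PGD or AutoAttack
    return {"method": name, "robust_acc": robust_acc,
            "train_time_s": train_time_s, "peak_mem_gb": peak_mem_gb}
```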
If this is right
- Well-designed single-step methods achieve comparable or superior robustness to PGD-AT at much lower training cost.
- No individual FastAT method leads on every combination of robustness metric and efficiency measure.
- Future FastAT proposals must be tested under the same unified architecture and no-external-data rules to claim genuine improvement.
- The public codebase allows any new method to be inserted and measured against the same baselines on the three datasets.
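If the released codebase exposes a plugin-style interface for that last point, registering a new method might look roughly like the sketch below; the decorator and registry names are assumptions, not the repository's real API.

```python
# Hypothetical registry pattern; the released codebase's extension mechanism may differ.
FASTAT_METHODS = {}

def register_method(name):
    """Expose a training routine under `name` so a shared harness can run it
    against the same baselines on CIFAR-10, CIFAR-100, and Tiny-ImageNet."""
    def decorator(train_fn):
        FASTAT_METHODS[name] = train_fn
        return train_fn
    return decorator

@register_method("my_single_step_method")
def train_my_single_step_method(model, train_loader):
    # Method-specific adversarial training loop goes here.
    ...
```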
Where Pith is reading between the lines
- Practical deployment of robust models may shift toward single-step training when hardware budgets are limited.
- The benchmark could be extended to larger datasets or additional attack types to test whether the single-step advantage persists.
- Trade-off curves between robustness and cost may guide selection of methods for specific application constraints.
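On that last point, one simple way to read a robustness-versus-cost trade-off is to keep only the Pareto-optimal methods, i.e., those for which no other method is at least as robust and at least as cheap. A minimal sketch, assuming result dictionaries like the ones produced by the harness above:

```python
def pareto_front(results):
    # Keep methods not dominated on (robust_acc: higher is better,
    # train_time_s: lower is better).
    front = []
    for r in results:
        dominated = any(
            o["robust_acc"] >= r["robust_acc"]
            and o["train_time_s"] <= r["train_time_s"]
            and (o["robust_acc"] > r["robust_acc"] or o["train_time_s"] < r["train_time_s"])
            for o in results
            if o is not r
        )
        if not dominated:
            front.append(r)
    return front
```

A practitioner with a fixed time or memory budget would then pick the most robust method on the frontier that fits within that budget.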
Load-bearing premise
The assumption that the chosen unified architecture and standardized settings produce conclusions that generalize beyond these benchmark conditions and that the re-implementations faithfully reproduce the original authors' intent.
What would settle it
If re-running the same methods under the benchmark rules, but with a different base architecture or with external data allowed, produces large changes in the relative rankings of robustness versus cost, then the claim that the framework yields generalizable fair comparisons would be falsified.
Original abstract
Fast Adversarial Training (FastAT) seeks to achieve adversarial robustness at a fraction of the computational cost incurred by standard multi-step methods such as PGD-AT. Although numerous FastAT techniques have been proposed in recent years, fair comparison among them remains elusive. Existing benchmarks and public leaderboards typically permit diverse model architectures, varying training configurations, and external data sources, making it unclear whether reported improvements reflect genuine algorithmic advances or merely more favorable experimental conditions. To address this problem, we introduce the FastAT Benchmark, a controlled evaluation framework built on three core design principles: unified architecture requirements, standardized training settings, and strict prohibition of external or synthetic data. The benchmark implements over twenty representative FastAT methods within a single codebase, enabling direct and reproducible comparison. Each method is assessed through a dual-metric evaluation framework that measures both adversarial robustness (accuracy under PGD, AutoAttack, and CR Attack) and computational cost (GPU training time and peak memory footprint). Comprehensive experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet provide reliable baseline measurements and reveal that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions. The complete benchmark, including source code, configuration files, and experimental results, is publicly available to support transparent and fair evaluation of future FastAT research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the FastAT Benchmark, a controlled evaluation framework for Fast Adversarial Training methods based on three design principles: unified architecture requirements, standardized training settings, and strict prohibition of external or synthetic data. It re-implements over twenty representative FastAT methods in a single codebase and evaluates them via a dual-metric framework measuring adversarial robustness (under PGD, AutoAttack, and CR Attack) alongside computational cost (GPU training time and peak memory). Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet lead to the conclusion that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions. The benchmark code, configurations, and results are released publicly.
Significance. If the re-implementations are faithful and the standardization does not introduce systematic biases, this work offers a valuable contribution by establishing reproducible baselines that address the incomparability issues plaguing prior FastAT literature. The public release of the complete benchmark supports transparency and future research. The dual-metric evaluation and finding on competitive single-step methods have practical relevance for efficient robust training, and the absence of external data strengthens the controlled nature of the comparisons.
major comments (1)
- The central empirical claim—that well-designed single-step methods can match or surpass PGD-AT robustness—depends on the fidelity of the re-implementations under the unified architecture and hyperparameters. The manuscript should include a validation table (in the experimental results section) directly comparing each method's reproduced accuracy and cost metrics to the numbers originally reported in the source papers (using the originals' settings and architectures where feasible) to rule out that observed gaps or equivalences arise from re-implementation choices rather than algorithmic merit.
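For illustration only, such a validation table could be assembled along the lines of the sketch below; the column names and the example row label are hypothetical placeholders, and the values would be transcribed from the source papers (reported) or measured in the benchmark's own reruns (reproduced).

```python
import math
import pandas as pd

# Hypothetical skeleton: NaN marks values still to be transcribed or measured.
records = [
    {"method": "FGSM-RS", "reported_robust_acc": math.nan, "reproduced_robust_acc": math.nan,
     "reported_time_h": math.nan, "reproduced_time_h": math.nan},
    # ... one row per re-implemented method ...
]
table = pd.DataFrame(records)
table["acc_gap_pp"] = table["reproduced_robust_acc"] - table["reported_robust_acc"]
print(table.to_string(index=False))
```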
minor comments (2)
- The abstract introduces 'CR Attack' without definition or citation; this should be clarified with a brief description or reference in the introduction or methods section.
- The justification for the specific choice of unified architecture (e.g., which ResNet variant) and training schedule could be expanded to better address potential biases introduced by standardization.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the FastAT Benchmark. We address the major comment below and commit to revising the manuscript accordingly.
Point-by-point responses
- Referee: The central empirical claim—that well-designed single-step methods can match or surpass PGD-AT robustness—depends on the fidelity of the re-implementations under the unified architecture and hyperparameters. The manuscript should include a validation table (in the experimental results section) directly comparing each method's reproduced accuracy and cost metrics to the numbers originally reported in the source papers (using the originals' settings and architectures where feasible) to rule out that observed gaps or equivalences arise from re-implementation choices rather than algorithmic merit.
Authors: We agree that validating the fidelity of the re-implementations is important for supporting our central empirical claim. While the benchmark's core contribution is standardized evaluation under unified conditions, we will add a validation table in the experimental results section. For each method where the original architecture, hyperparameters, and reported numbers are available and feasible to reproduce, the table will directly compare our re-implemented metrics (accuracy and cost) against the source papers' originally reported values. Any discrepancies will be analyzed and discussed to confirm that our implementations faithfully capture the intended algorithms. This addition will strengthen the manuscript by ruling out re-implementation artifacts as the source of observed performance differences. revision: yes
Circularity Check
No circularity: purely empirical benchmark with no derivations or self-referential reductions
full rationale
The paper introduces an empirical benchmarking framework for FastAT methods. It specifies a unified architecture, standardized hyperparameters, and no external data, then reports results from re-implementing over twenty methods under those controls. No equations, fitted parameters, predictions, or uniqueness theorems appear. Central claims rest on observed accuracy/cost trade-offs across CIFAR-10, CIFAR-100, and Tiny-ImageNet; these are direct experimental outputs, not reductions to prior self-citations or input fits. Self-citations, if present, are not load-bearing for the benchmark design or headline findings. The work stands on its own relative to external benchmarks and does not invoke any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Adversarial robustness is appropriately measured by clean accuracy together with accuracy under PGD, AutoAttack, and CR Attack.
- Domain assumption: Forbidding external or synthetic data and fixing the architecture and training schedule removes the confounding factors that previously prevented fair algorithmic comparison.
Reference graph
Works this paper leans on
[1] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," arXiv preprint arXiv:1706.06083, 2017.
[2] E. Wong, L. Rice, and J. Z. Kolter, "Fast is better than free: Revisiting adversarial training," arXiv preprint arXiv:2001.03994, 2020.
[3] P. de Jorge Aranda, A. Bibi, R. Volpi, A. Sanyal, P. Torr, G. Rogez, and P. Dokania, "Make some noise: Reliable and efficient single-step adversarial training," Advances in Neural Information Processing Systems, vol. 35, pp. 12881–12893, 2022.
[4] Z. Golgooni, M. Saberi, M. Eskandar, and M. H. Rohban, "ZeroGrad: Costless conscious remedies for catastrophic overfitting in the FGSM adversarial training," Intelligent Systems with Applications, vol. 19, p. 200258, 2023.
[5] X. Jia, Y. Zhang, X. Wei, B. Wu, K. Ma, J. Wang, and X. Cao, "Prior-guided adversarial initialization for fast adversarial training," in European Conference on Computer Vision. Springer, 2022, pp. 567–584.
[6] C. Pan, Q. Li, and X. Yao, "Adversarial initialization with universal adversarial perturbation: A new approach to fast adversarial training," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, 2024, pp. 21501–21509.
[7] H. Kim, W. Lee, and J. Lee, "Understanding catastrophic overfitting in single-step adversarial training," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 9, 2021, pp. 8119–8127.
[8] M. Andriushchenko and N. Flammarion, "Understanding and improving fast adversarial training," Advances in Neural Information Processing Systems, vol. 33, pp. 16048–16059, 2020.
[9] G. Sriramanan, S. Addepalli, A. Baburaj et al., "Guided adversarial attack for evaluating and enhancing adversarial defenses," Advances in Neural Information Processing Systems, vol. 33, pp. 20297–20308, 2020.
[10] G. Sriramanan, S. Addepalli, A. Baburaj et al., "Towards efficient and effective adversarial training," Advances in Neural Information Processing Systems, vol. 34, pp. 11821–11833, 2021.
[11] R. Lin, C. Yu, and T. Liu, "Eliminating catastrophic overfitting via abnormal adversarial examples regularization," Advances in Neural Information Processing Systems, vol. 36, pp. 67866–67885, 2023.
[12] M. Zhao, L. Zhang, Y. Kong, and B. Yin, "Fast adversarial training with smooth convergence," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4720–4729.
[13] Z. Wang, H. Wang, C. Tian, and Y. Jin, "Preventing catastrophic overfitting in fast adversarial training: A bi-level optimization perspective," in European Conference on Computer Vision. Springer, 2024, pp. 144–160.
[14] E. A. Rocamora, F. Liu, G. G. Chrysos, P. M. Olmos, and V. Cevher, "Efficient local linearity regularization to overcome catastrophic overfitting," arXiv preprint arXiv:2401.11618, 2024.
[15] C. Pan, K. Tang, Q. Li, and X. Yao, "Mitigating catastrophic overfitting in fast adversarial training via label information elimination," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 2991–3000.
[16] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, and P. Frossard, "RobustBench: A standardized adversarial robustness benchmark," in NeurIPS 2020 Competition and Demonstration Track. PMLR, 2021, pp. 141–151.
[17] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[18] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
[19] S. Zagoruyko and N. Komodakis, "Wide residual networks," arXiv preprint arXiv:1605.07146, 2016.
[20] B. R. Bartoldson, J. Diffenderfer, K. Parasyris, and B. Kailkhura, "Adversarial robustness limits via scaling-law and human-alignment studies," arXiv preprint arXiv:2404.09349, 2024.
[21] S. Peng, W. Xu, C. Cornelius, M. Hull, K. Li, R. Duggal, M. Phute, J. Martin, and D. H. Chau, "Robust principles: Architectural design principles for adversarially robust CNNs," arXiv preprint arXiv:2308.16258, 2023.
[22] C. Pan, K. Tang, Q. Li, and X. Yao, "Rethinking RobustBench: Is high synthetic-test data similarity an implicit information advantage inflating robustness scores?" in 2025 IEEE 12th International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 2025, pp. 1–10.
[23] F. Croce and M. Hein, "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks," in International Conference on Machine Learning. PMLR, 2020, pp. 2206–2216.
[24] C. Pan, Y. Wu, K. Tang, Q. Li, and X. Yao, "Efficient robustness evaluation via constraint relaxation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 6, 2025, pp. 6263–6271.
[25] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
[26] A. Shafahi, M. Najibi, A. Ghiasi, Z. Xu, J. Dickerson, C. Studer, L. S. Davis, G. Taylor, and T. Goldstein, "Adversarial training for free!" in Advances in Neural Information Processing Systems, vol. 32, 2019, pp. 3353–3364.
[27] P. Izmailov, D. Podoprikhin, T. Garipov, D. Vetrov, and A. G. Wilson, "Averaging weights leads to wider optima and better generalization," arXiv preprint arXiv:1803.05407, 2018.
[28] A. Krizhevsky, G. Hinton et al., "Learning multiple layers of features from tiny images," 2009.
[29] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255.
[30] K. He, X. Zhang, S. Ren, and J. Sun, "Identity mappings in deep residual networks," in European Conference on Computer Vision. Springer, 2016, pp. 630–645.
[31] M. Goibert and E. Dohmatob, "Adversarial robustness via label-smoothing," arXiv preprint arXiv:1906.11567, 2019.