FastAT Benchmark: A Comprehensive Framework for Fair Evaluation of Fast Adversarial Training Methods
Pith reviewed 2026-05-10 00:19 UTC · model grok-4.3
The pith
A controlled benchmark shows well-designed single-step adversarial training can match PGD-AT robustness at far lower cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The FastAT Benchmark enforces unified architecture, standardized training settings, and strict prohibition of external data to evaluate over twenty representative FastAT methods on dual metrics of robustness and cost. Comprehensive tests on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions.
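To make the cost gap concrete: PGD-AT performs several forward/backward passes per batch to craft its adversarial examples, while single-step methods in the spirit of random-start FGSM [2] use only one. The sketch below is a minimal, generic illustration of that difference in PyTorch, not a re-implementation of any specific benchmarked method.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps, alpha):
    # Single-step: one gradient computation per batch (random start, then an FGSM step).
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).detach().clamp(0, 1)

def pgd_example(model, x, y, eps, alpha, steps=10):
    # Multi-step: `steps` gradient computations per batch, hence the higher cost.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach().clamp(0, 1)
```

During adversarial training the returned examples replace (or augment) the clean batch before the usual optimizer step, so the per-batch attack cost is the main driver of the training-time difference the benchmark measures.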
What carries the argument
The FastAT Benchmark, a controlled evaluation framework that standardizes model architecture, training hyperparameters, and data usage to produce directly comparable robustness and efficiency measurements for FastAT methods.
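One way to picture the dual-metric protocol: run every method's training routine on the same fixed architecture and schedule, and log robust accuracy alongside wall-clock training time and peak GPU memory. The harness below is a minimal sketch under that assumption; `model_fn`, `train_fn`, and `robust_eval_fn` are hypothetical stand-ins, not the benchmark's actual API.

```python
import time
import torch

def run_method(name, train_fn, model_fn, loaders, robust_eval_fn, device="cuda"):
    # Same architecture and data for every method: only `train_fn` differs.
    model = model_fn().to(device)
    torch.cuda.reset_peak_memory_stats(device)
    start = time.perf_counter()
    train_fn(model, loaders["train"])                    # method-specific FastAT loop
    train_time_s = time.perf_counter() - start
    peak_mem_gb = torch.cuda.max_memory_allocated(device) / 1024 ** 3
    robust_acc = robust_eval_fn(model, loaders["test"])  # e.g., accuracy under PGD or AutoAttack
    return {"method": name, "robust_acc": robust_acc,
            "train_time_s": train_time_s, "peak_mem_gb": peak_mem_gb}
```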
If this is right
- Well-designed single-step methods achieve comparable or superior robustness to PGD-AT at much lower training cost.
- No individual FastAT method leads on every combination of robustness metric and efficiency measure.
- Future FastAT proposals must be tested under the same unified architecture and no-external-data rules to claim genuine improvement.
- The public codebase allows any new method to be inserted and measured against the same baselines on the three datasets.
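If the released codebase exposes a plugin-style interface for that last point, registering a new method might look roughly like the sketch below; the decorator and registry names are assumptions, not the repository's real API.

```python
# Hypothetical registry pattern; the released codebase's extension mechanism may differ.
FASTAT_METHODS = {}

def register_method(name):
    """Expose a training routine under `name` so a shared harness can run it
    against the same baselines on CIFAR-10, CIFAR-100, and Tiny-ImageNet."""
    def decorator(train_fn):
        FASTAT_METHODS[name] = train_fn
        return train_fn
    return decorator

@register_method("my_single_step_method")
def train_my_single_step_method(model, train_loader):
    # Method-specific adversarial training loop goes here.
    ...
```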
Where Pith is reading between the lines
- Practical deployment of robust models may shift toward single-step training when hardware budgets are limited.
- The benchmark could be extended to larger datasets or additional attack types to test whether the single-step advantage persists.
- Trade-off curves between robustness and cost may guide selection of methods for specific application constraints.
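On that last point, one simple way to read a robustness-versus-cost trade-off is to keep only the Pareto-optimal methods, i.e., those for which no other method is at least as robust and at least as cheap. A minimal sketch, assuming result dictionaries like the ones produced by the harness above:

```python
def pareto_front(results):
    # Keep methods not dominated on (robust_acc: higher is better,
    # train_time_s: lower is better).
    front = []
    for r in results:
        dominated = any(
            o["robust_acc"] >= r["robust_acc"]
            and o["train_time_s"] <= r["train_time_s"]
            and (o["robust_acc"] > r["robust_acc"] or o["train_time_s"] < r["train_time_s"])
            for o in results
            if o is not r
        )
        if not dominated:
            front.append(r)
    return front
```

A practitioner with a fixed time or memory budget would then pick the most robust method on the frontier that fits within that budget.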
Load-bearing premise
The assumption that the chosen unified architecture and standardized settings produce conclusions that generalize beyond these benchmark conditions and that the re-implementations faithfully reproduce the original authors' intent.
What would settle it
If re-running the same methods under the benchmark rules, but with a different base architecture or with external data allowed, produces large changes in the relative rankings of robustness versus cost, then the claim that the framework yields generalizable fair comparisons would be falsified.
Original abstract
Fast Adversarial Training (FastAT) seeks to achieve adversarial robustness at a fraction of the computational cost incurred by standard multi-step methods such as PGD-AT. Although numerous FastAT techniques have been proposed in recent years, fair comparison among them remains elusive. Existing benchmarks and public leaderboards typically permit diverse model architectures, varying training configurations, and external data sources, making it unclear whether reported improvements reflect genuine algorithmic advances or merely more favorable experimental conditions. To address this problem, we introduce the FastAT Benchmark, a controlled evaluation framework built on three core design principles: unified architecture requirements, standardized training settings, and strict prohibition of external or synthetic data. The benchmark implements over twenty representative FastAT methods within a single codebase, enabling direct and reproducible comparison. Each method is assessed through a dual-metric evaluation framework that measures both adversarial robustness (accuracy under PGD, AutoAttack, and CR Attack) and computational cost (GPU training time and peak memory footprint). Comprehensive experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet provide reliable baseline measurements and reveal that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions. The complete benchmark, including source code, configuration files, and experimental results, is publicly available to support transparent and fair evaluation of future FastAT research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the FastAT Benchmark, a controlled evaluation framework for Fast Adversarial Training methods based on three design principles: unified architecture requirements, standardized training settings, and strict prohibition of external or synthetic data. It re-implements over twenty representative FastAT methods in a single codebase and evaluates them via a dual-metric framework measuring adversarial robustness (under PGD, AutoAttack, and CR Attack) alongside computational cost (GPU training time and peak memory). Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet lead to the conclusion that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions. The benchmark code, configurations, and results are released publicly.
Significance. If the re-implementations are faithful and the standardization does not introduce systematic biases, this work offers a valuable contribution by establishing reproducible baselines that address the incomparability issues plaguing prior FastAT literature. The public release of the complete benchmark supports transparency and future research. The dual-metric evaluation and finding on competitive single-step methods have practical relevance for efficient robust training, and the absence of external data strengthens the controlled nature of the comparisons.
major comments (1)
- The central empirical claim—that well-designed single-step methods can match or surpass PGD-AT robustness—depends on the fidelity of the re-implementations under the unified architecture and hyperparameters. The manuscript should include a validation table (in the experimental results section) directly comparing each method's reproduced accuracy and cost metrics to the numbers originally reported in the source papers (using the originals' settings and architectures where feasible) to rule out that observed gaps or equivalences arise from re-implementation choices rather than algorithmic merit.
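For illustration only, such a validation table could be assembled along the lines of the sketch below; the column names and the example row label are hypothetical placeholders, and the values would be transcribed from the source papers (reported) or measured in the benchmark's own reruns (reproduced).

```python
import math
import pandas as pd

# Hypothetical skeleton: NaN marks values still to be transcribed or measured.
records = [
    {"method": "FGSM-RS", "reported_robust_acc": math.nan, "reproduced_robust_acc": math.nan,
     "reported_time_h": math.nan, "reproduced_time_h": math.nan},
    # ... one row per re-implemented method ...
]
table = pd.DataFrame(records)
table["acc_gap_pp"] = table["reproduced_robust_acc"] - table["reported_robust_acc"]
print(table.to_string(index=False))
```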
minor comments (2)
- The abstract introduces 'CR Attack' without definition or citation; this should be clarified with a brief description or reference in the introduction or methods section.
- The justification for the specific choice of unified architecture (e.g., which ResNet variant) and training schedule could be expanded to better address potential biases introduced by standardization.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the FastAT Benchmark. We address the major comment below and commit to revising the manuscript accordingly.
Point-by-point responses
- Referee: The central empirical claim—that well-designed single-step methods can match or surpass PGD-AT robustness—depends on the fidelity of the re-implementations under the unified architecture and hyperparameters. The manuscript should include a validation table (in the experimental results section) directly comparing each method's reproduced accuracy and cost metrics to the numbers originally reported in the source papers (using the originals' settings and architectures where feasible) to rule out that observed gaps or equivalences arise from re-implementation choices rather than algorithmic merit.
Authors: We agree that validating the fidelity of the re-implementations is important for supporting our central empirical claim. While the benchmark's core contribution is standardized evaluation under unified conditions, we will add a validation table in the experimental results section. For each method where the original architecture, hyperparameters, and reported numbers are available and feasible to reproduce, the table will directly compare our re-implemented metrics (accuracy and cost) against the source papers' originally reported values. Any discrepancies will be analyzed and discussed to confirm that our implementations faithfully capture the intended algorithms. This addition will strengthen the manuscript by ruling out re-implementation artifacts as the source of observed performance differences. revision: yes
Circularity Check
No circularity: purely empirical benchmark with no derivations or self-referential reductions
full rationale
The paper introduces an empirical benchmarking framework for FastAT methods. It specifies a unified architecture, standardized hyperparameters, and no external data, then reports results from re-implementing over twenty methods under those controls. No equations, fitted parameters, predictions, or uniqueness theorems appear. Central claims rest on observed accuracy/cost trade-offs across CIFAR-10, CIFAR-100, and Tiny-ImageNet; these are direct experimental outputs, not reductions to prior self-citations or input fits. Self-citations, if present, are not load-bearing for the benchmark design or headline findings. The work stands on its own relative to external benchmarks and does not invoke any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Adversarial robustness is appropriately measured by clean accuracy together with accuracy under PGD, AutoAttack, and CR Attack.
- Domain assumption: Forbidding external or synthetic data and fixing the architecture and training schedule removes the confounding factors that previously prevented fair algorithmic comparison.
Reference graph
Works this paper leans on
[1] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," arXiv preprint arXiv:1706.06083, 2017.
[2] E. Wong, L. Rice, and J. Z. Kolter, "Fast is better than free: Revisiting adversarial training," arXiv preprint arXiv:2001.03994, 2020.
[3] P. de Jorge Aranda, A. Bibi, R. Volpi, A. Sanyal, P. Torr, G. Rogez, and P. Dokania, "Make some noise: Reliable and efficient single-step adversarial training," Advances in Neural Information Processing Systems, vol. 35, pp. 12881–12893, 2022.
[4] Z. Golgooni, M. Saberi, M. Eskandar, and M. H. Rohban, "ZeroGrad: Costless conscious remedies for catastrophic overfitting in the FGSM adversarial training," Intelligent Systems with Applications, vol. 19, p. 200258, 2023.
[5] X. Jia, Y. Zhang, X. Wei, B. Wu, K. Ma, J. Wang, and X. Cao, "Prior-guided adversarial initialization for fast adversarial training," in European Conference on Computer Vision. Springer, 2022, pp. 567–584.
[6] C. Pan, Q. Li, and X. Yao, "Adversarial initialization with universal adversarial perturbation: A new approach to fast adversarial training," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, 2024, pp. 21501–21509.
[7] H. Kim, W. Lee, and J. Lee, "Understanding catastrophic overfitting in single-step adversarial training," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 9, 2021, pp. 8119–8127.
[8] M. Andriushchenko and N. Flammarion, "Understanding and improving fast adversarial training," Advances in Neural Information Processing Systems, vol. 33, pp. 16048–16059, 2020.
[9] G. Sriramanan, S. Addepalli, A. Baburaj et al., "Guided adversarial attack for evaluating and enhancing adversarial defenses," Advances in Neural Information Processing Systems, vol. 33, pp. 20297–20308, 2020.
[10] G. Sriramanan, S. Addepalli, A. Baburaj et al., "Towards efficient and effective adversarial training," Advances in Neural Information Processing Systems, vol. 34, pp. 11821–11833, 2021.
[11] R. Lin, C. Yu, and T. Liu, "Eliminating catastrophic overfitting via abnormal adversarial examples regularization," Advances in Neural Information Processing Systems, vol. 36, pp. 67866–67885, 2023.
[12] M. Zhao, L. Zhang, Y. Kong, and B. Yin, "Fast adversarial training with smooth convergence," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4720–4729.
[13] Z. Wang, H. Wang, C. Tian, and Y. Jin, "Preventing catastrophic overfitting in fast adversarial training: A bi-level optimization perspective," in European Conference on Computer Vision. Springer, 2024, pp. 144–160.
[14] E. A. Rocamora, F. Liu, G. G. Chrysos, P. M. Olmos, and V. Cevher, "Efficient local linearity regularization to overcome catastrophic overfitting," arXiv preprint arXiv:2401.11618, 2024.
[15] C. Pan, K. Tang, Q. Li, and X. Yao, "Mitigating catastrophic overfitting in fast adversarial training via label information elimination," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 2991–3000.
[16] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, and P. Frossard, "RobustBench: A standardized adversarial robustness benchmark," in NeurIPS 2020 Competition and Demonstration Track. PMLR, 2021, pp. 141–151.
[17] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[18] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
[19] S. Zagoruyko and N. Komodakis, "Wide residual networks," arXiv preprint arXiv:1605.07146, 2016.
[20] B. R. Bartoldson, J. Diffenderfer, K. Parasyris, and B. Kailkhura, "Adversarial robustness limits via scaling-law and human-alignment studies," arXiv preprint arXiv:2404.09349, 2024.
[21] S. Peng, W. Xu, C. Cornelius, M. Hull, K. Li, R. Duggal, M. Phute, J. Martin, and D. H. Chau, "Robust principles: Architectural design principles for adversarially robust CNNs," arXiv preprint arXiv:2308.16258, 2023.
[22] C. Pan, K. Tang, Q. Li, and X. Yao, "Rethinking RobustBench: Is high synthetic-test data similarity an implicit information advantage inflating robustness scores?" in 2025 IEEE 12th International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 2025, pp. 1–10.
[23] F. Croce and M. Hein, "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks," in International Conference on Machine Learning. PMLR, 2020, pp. 2206–2216.
[24] C. Pan, Y. Wu, K. Tang, Q. Li, and X. Yao, "Efficient robustness evaluation via constraint relaxation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 6, 2025, pp. 6263–6271.
[25] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
[26] A. Shafahi, M. Najibi, A. Ghiasi, Z. Xu, J. Dickerson, C. Studer, L. S. Davis, G. Taylor, and T. Goldstein, "Adversarial training for free!" in Advances in Neural Information Processing Systems, vol. 32, 2019, pp. 3353–3364.
[27] P. Izmailov, D. Podoprikhin, T. Garipov, D. Vetrov, and A. G. Wilson, "Averaging weights leads to wider optima and better generalization," arXiv preprint arXiv:1803.05407, 2018.
[28] A. Krizhevsky, G. Hinton et al., "Learning multiple layers of features from tiny images," 2009.
[29] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255.
[30] K. He, X. Zhang, S. Ren, and J. Sun, "Identity mappings in deep residual networks," in European Conference on Computer Vision. Springer, 2016, pp. 630–645.
[31] M. Goibert and E. Dohmatob, "Adversarial robustness via label-smoothing," arXiv preprint arXiv:1906.11567, 2019.