Beyond Attack Success Rate: A Multi-Metric Evaluation of Adversarial Transferability in Medical Imaging Models
Pith reviewed 2026-05-10 10:59 UTC · model grok-4.3
The pith
Attack success rate alone misses key image quality factors in adversarial evaluations of medical imaging models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A systematic study across the PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert datasets demonstrates that perceptual and distortion metrics remain strongly associated with one another while exhibiting minimal correlation with attack success rate, and that this dissociation holds consistently for both CNN and Vision Transformer models under multiple attack methods at several perturbation budgets.
What carries the argument
Multi-metric evaluation that jointly tracks attack success rate together with PSNR, SSIM, and L2 perturbation magnitude to expose relationships among efficacy, visual quality, and transfer behavior.
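A minimal sketch of what such a joint evaluation loop could look like, assuming clean and adversarial image batches scaled to [0, 1] and a generic classifier; `evaluate_attack` and its arguments are illustrative placeholders, not the paper's implementation:

```python
# Hedged sketch: jointly tracking ASR, PSNR, SSIM, and L2 for one attack run.
# `model`, `clean`, `adv`, and `labels` are hypothetical placeholders.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_attack(model, clean, adv, labels):
    """Return ASR plus mean PSNR, SSIM, and L2 over a batch.

    clean, adv: float arrays in [0, 1], shape (N, H, W, C).
    labels:     ground-truth class indices, shape (N,).
    model:      callable mapping an image batch to predicted classes.
    """
    # Untargeted success = misclassification of the adversarial input;
    # the paper's exact ASR definition may condition on clean correctness.
    preds = np.asarray(model(adv))
    asr = float(np.mean(preds != labels))

    psnr, ssim, l2 = [], [], []
    for x, x_adv in zip(clean, adv):
        psnr.append(peak_signal_noise_ratio(x, x_adv, data_range=1.0))
        ssim.append(structural_similarity(x, x_adv, data_range=1.0, channel_axis=-1))
        l2.append(np.linalg.norm((x_adv - x).ravel()))  # per-image L2 magnitude

    return {"asr": asr, "psnr": float(np.mean(psnr)),
            "ssim": float(np.mean(ssim)), "l2": float(np.mean(l2))}
```

Repeating such a loop over every (model, attack, budget) configuration yields the per-configuration records on which a correlation analysis can operate.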
If this is right
- Adversarial robustness assessments in medical AI must incorporate image quality and overhead measures alongside success rate to avoid underestimating real risk.
- The same dissociation between success rate and perceptual metrics holds for both convolutional networks and vision transformers.
- Attack transferability studies gain accuracy when they report perturbation strength and resulting image fidelity in addition to binary success.
- Clinical deployment decisions should draw on multi-metric profiles rather than success-rate rankings alone.
Where Pith is reading between the lines
- Attack methods that maximize success at low perceptual cost may transfer differently than those that simply achieve high success regardless of distortion.
- The observed metric dissociation could be tested by checking whether models trained to resist low-distortion attacks also resist high-distortion ones.
- Extending the same multi-metric protocol to natural-image benchmarks would reveal whether the pattern is specific to medical data or more general.
Load-bearing premise
The four chosen medical datasets, seven models, seven attack methods, and five perturbation budgets are representative enough to support the general claim that attack success rate alone is inadequate for evaluating adversarial transferability and robustness.
What would settle it
Finding a strong correlation (of either sign) between attack success rate and at least one perceptual or distortion metric when the same evaluation protocol is repeated on a previously untested medical imaging dataset or model architecture would contradict the reported pattern.
read the original abstract
While deep learning systems are becoming increasingly prevalent in medical image analysis, their vulnerabilities to adversarial perturbations raise serious concerns for clinical deployment. These vulnerability evaluations largely rely on Attack Success Rate (ASR), a binary metric that indicates solely whether an attack is successful. However, the ASR metric does not account for other factors, such as perturbation strength, perceptual image quality, and cross-architecture attack transferability, and therefore, the interpretation is incomplete. This gap requires consideration, as complex, large-scale deep learning systems, including Vision Transformers (ViTs), are increasingly challenging the dominance of Convolutional Neural Networks (CNNs). These architectures learn differently, and it is unclear whether a single metric, e.g., ASR, can effectively capture adversarial behavior. To address this, we perform a systematic empirical study on four medical image datasets: PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert. We evaluate seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, and ViT-B/16) against seven attack methods at five perturbation budgets, measuring ASR, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and $L_2$ perturbation magnitude. Our findings show a consistent pattern: perceptual and distortion metrics are strongly associated with one another and exhibit minimal correlation with ASR. This applies to both CNNs and ViTs. The results demonstrate that ASR alone is an inadequate indicator of adversarial robustness and transferability. Consequently, we argue that a thorough assessment of adversarial risk in medical AI necessitates multi-metric frameworks that encompass not only the attack efficacy but also its methodology and associated overheads.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a systematic empirical study on adversarial attacks in medical imaging, evaluating Attack Success Rate (ASR) together with perceptual metrics (PSNR, SSIM) and distortion (L2 magnitude) across four datasets (PathMNIST, DermaMNIST, RetinaMNIST, CheXpert), seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, ViT-B/16), seven attack methods, and five perturbation budgets. It reports that perceptual and distortion metrics are strongly inter-correlated but show minimal correlation with ASR for both CNNs and ViTs, concluding that ASR alone is an inadequate indicator of adversarial robustness and transferability and that multi-metric frameworks are needed.
Significance. If the low-correlation pattern is robustly quantified, the work would usefully demonstrate that binary ASR overlooks important aspects of attack impact such as image quality degradation and cross-architecture transfer in medical settings. The multi-architecture (CNN + ViT) and multi-dataset design provides timely observational evidence at a moment when transformers are entering medical imaging pipelines, potentially motivating more comprehensive robustness benchmarks that account for perturbation overhead.
major comments (2)
- [Abstract] Abstract: the assertion that perceptual/distortion metrics 'exhibit minimal correlation with ASR' is presented without any reported correlation coefficients, p-values, or description of the statistical procedure (e.g., Pearson/Spearman, multiple-comparison correction). Because this quantitative pattern is the sole empirical basis for declaring ASR inadequate, the absence of these details prevents assessment of whether the correlation is truly minimal or merely below an arbitrary threshold.
- [Abstract] Abstract and concluding discussion: the generalization that 'ASR alone is an inadequate indicator of adversarial robustness and transferability' in medical AI rests on four 2D classification datasets and the listed model/attack combinations. No sensitivity analysis, additional modalities (3D volumes, ultrasound), or argument for representativeness is supplied; if the observed decoupling is specific to these choices rather than general, the call for multi-metric frameworks does not necessarily extend to the broader clinical domain.
minor comments (1)
- [Abstract] Abstract: the phrase 'associated with one another' for PSNR/SSIM/L2 should be accompanied by the actual inter-metric correlation values in the results section so readers can judge the strength of the 'strong' association.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on statistical reporting and scope. We address each major comment below and will incorporate revisions to strengthen the manuscript.
read point-by-point responses
Referee: [Abstract] Abstract: the assertion that perceptual/distortion metrics 'exhibit minimal correlation with ASR' is presented without any reported correlation coefficients, p-values, or description of the statistical procedure (e.g., Pearson/Spearman, multiple-comparison correction). Because this quantitative pattern is the sole empirical basis for declaring ASR inadequate, the absence of these details prevents assessment of whether the correlation is truly minimal or merely below an arbitrary threshold.
Authors: We agree that the abstract should include quantitative details to support the claim. Our analysis computed Pearson and Spearman correlations between ASR and the perceptual/distortion metrics (PSNR, SSIM, L2) across all experiments, with multiple-comparison correction. We will revise the abstract to report the correlation coefficients, p-values, and a concise description of the procedure (Pearson/Spearman with Bonferroni correction), allowing readers to evaluate the minimal-correlation finding directly. revision: yes
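For concreteness, a minimal sketch of the kind of analysis this response describes, using SciPy's `pearsonr` and `spearmanr` with a Bonferroni correction; the `results` records and metric keys are hypothetical placeholders rather than the paper's code:

```python
# Hedged sketch: correlate ASR with each image-quality metric across all
# (model, attack, budget) configurations, with Bonferroni correction.
from scipy.stats import pearsonr, spearmanr

def correlate_with_asr(results, metrics=("psnr", "ssim", "l2"), alpha=0.05):
    """results: list of dicts, one per experimental configuration."""
    asr = [r["asr"] for r in results]
    corrected_alpha = alpha / (2 * len(metrics))  # Pearson + Spearman per metric
    table = {}
    for m in metrics:
        vals = [r[m] for r in results]
        r_p, p_p = pearsonr(asr, vals)    # linear association
        r_s, p_s = spearmanr(asr, vals)   # monotonic (rank) association
        table[m] = {"pearson": (r_p, p_p, p_p < corrected_alpha),
                    "spearman": (r_s, p_s, p_s < corrected_alpha)}
    return table
```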
Referee: [Abstract] Abstract and concluding discussion: the generalization that 'ASR alone is an inadequate indicator of adversarial robustness and transferability' in medical AI rests on four 2D classification datasets and the listed model/attack combinations. No sensitivity analysis, additional modalities (3D volumes, ultrasound), or argument for representativeness is supplied; if the observed decoupling is specific to these choices rather than general, the call for multi-metric frameworks does not necessarily extend to the broader clinical domain.
Authors: We acknowledge the scope is limited to 2D classification on four standard benchmark datasets (PathMNIST, DermaMNIST, RetinaMNIST, CheXpert) chosen for their coverage of common medical imaging tasks and modalities. The consistent low correlation pattern holds across both CNN and ViT architectures. We will add an explicit limitations paragraph in the discussion that argues for representativeness within 2D medical imaging, notes the absence of sensitivity analyses on 3D or ultrasound data, and recommends future multi-metric studies on additional modalities to test broader generalizability. revision: partial
Circularity Check
No circularity: purely observational empirical measurements
full rationale
The paper conducts direct experiments across four datasets, seven models, seven attacks, and five budgets, then reports measured correlations between ASR and perceptual/distortion metrics (PSNR, SSIM, L2). No equations, derivations, fitted parameters presented as predictions, or self-citations are used to support the central claim. The findings are observational patterns from the collected data, with no reduction of results to inputs by construction. This is a standard empirical study whose conclusions rest on the representativeness of the chosen setups rather than any circular logic.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Adversarial attacks are generated under standard $L_p$-norm constraints at fixed perturbation budgets.
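As a concrete illustration of this assumption, a minimal sketch of the fixed-budget constraint for the common L-infinity case; the budget eps = 8/255 and the [0, 1] pixel range are illustrative choices, not values taken from the paper:

```python
# Hedged sketch: enforce a fixed L-infinity perturbation budget.
import numpy as np

def project_linf(x_adv, x_clean, eps=8 / 255):
    """Project an adversarial image back into the L-inf ball around x_clean."""
    delta = np.clip(x_adv - x_clean, -eps, eps)  # enforce ||delta||_inf <= eps
    return np.clip(x_clean + delta, 0.0, 1.0)    # keep pixels in the valid range
```

Iterative attacks such as PGD apply exactly this kind of projection after each gradient step, which is what evaluating at "fixed perturbation budgets" refers to.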
Forward citations
Cited by 1 Pith paper
- Single-Configuration Attack Success Rate Is Not Enough: Jailbreak Evaluations Should Report Distributional Attack Success
  Jailbreak evaluations must report distributional statistics such as Variant Sensitivity Measure and Union Coverage across parameter variants rather than single best-case attack success rates.
Reference graph
Works this paper leans on
- [1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
- [2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, 2012.
- [3] S. V. Dibbo, A. Breuer, J. Moore, and M. Teti, "Improving robustness to model inversion attacks via sparse coding architectures," in European Conference on Computer Vision, pp. 117–136, Springer, 2024.
- [4] D. Amebley and S. Dibbo, "Are neuro-inspired multi-modal vision-language models resilient to membership inference privacy leakage?," arXiv preprint arXiv:2511.20710, 2025.
- [5] C.-W. Lien, S. Vhaduri, S. V. Dibbo, and M. Shaheed, "Explaining vulnerabilities of heart rate biometric models securing IoT wearables," Machine Learning with Applications, vol. 16, p. 100559, 2024.
- [6] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
- [7] S. Vhaduri, S. V. Dibbo, C.-Y. Chen, and C. Poellabauer, "Predicting next call duration: A future direction to promote mental health in the age of lockdown," in 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 804–811, IEEE, 2021.
- [8] D. Andor, C. Alberti, D. Weiss, A. Severyn, A. Presta, K. Ganchev, S. Petrov, and M. Collins, "Globally normalized transition-based neural networks," in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2442–2452, 2016.
- [9] S. Vhaduri, W. Cheung, and S. V. Dibbo, "Bag of on-phone ANNs to secure IoT objects using wearable and smartphone biometrics," IEEE Transactions on Dependable and Secure Computing, vol. 21, no. 3, pp. 1127–1138, 2023.
- [10] R. K. Sah and H. Ghasemzadeh, "Adversarial transferability in wearable sensor systems," arXiv preprint arXiv:2003.07982, 2020.
- [11] M. Nasr, J. Rando, N. Carlini, J. Hayase, M. Jagielski, A. F. Cooper, D. Ippolito, C. A. Choquette-Choo, F. Tramèr, and K. Lee, "Scalable extraction of training data from aligned, production language models," in The Thirteenth International Conference on Learning Representations, 2025.
- [12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
- [13] Y. Liu, X. Chen, C. Liu, and D. Song, "Delving into transferable adversarial examples and black-box attacks," arXiv preprint arXiv:1611.02770, 2016.
- [14] D. Popovic, A. Sadeghi, T. Yu, S. Chawla, and I. Khalil, "DeBackdoor: A deductive framework for detecting backdoor attacks on deep models with limited data," in 34th USENIX Security Symposium (USENIX Security 25), pp. 6419–6438, 2025.
- [15] U. Sara, M. Akter, M. S. Uddin, et al., "Image quality assessment through FSIM, SSIM, MSE and PSNR - a comparative study," Journal of Computer and Communications, vol. 7, no. 3, pp. 8–18, 2019.
- [16] B. Bilgic, I. Chatnuntawech, A. P. Fan, K. Setsompop, S. F. Cauley, L. L. Wald, and E. Adalsteinsson, "Fast image reconstruction with L2-regularization," Journal of Magnetic Resonance Imaging, vol. 40, no. 1, pp. 181–191, 2014.
- [17] J. Benesty, J. Chen, Y. Huang, and I. Cohen, "Pearson correlation coefficient," in Noise Reduction in Speech Processing, pp. 1–4, Springer, 2009.
- [18] P. Sedgwick, "Spearman's rank correlation coefficient," BMJ, vol. 349, 2014.
- [19] J. Piet, M. Alrashed, C. Sitawarin, S. Chen, Z. Wei, E. Sun, B. Alomair, and D. Wagner, "Jatmo: Prompt injection defense by task-specific fine-tuning," in European Symposium on Research in Computer Security, pp. 105–124, Springer, 2024.
- [20] L. D. Peellawalage, S. Dibbo, and S. Vhaduri, "Meta-research on backdoors: Dataset and threat model shifts in multimodal backdoor attacks," 2026.
- [21] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
- [22] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial machine learning at scale," arXiv preprint arXiv:1611.01236, 2016.
- [23] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," arXiv preprint arXiv:1706.06083, 2017.
- [24] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li, "Boosting adversarial attacks with momentum," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9185–9193, 2018.
- [25] X. Wang and K. He, "Enhancing the transferability of adversarial attacks through variance tuning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1924–1933, 2021.
- [26] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, and A. L. Yuille, "Improving transferability of adversarial examples with input diversity," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2730–2739, 2019.
- [27] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
- [28] J. Yang, R. Shi, D. Wei, Z. Liu, L. Zhao, B. Ke, H. Pfister, and B. Ni, "MedMNIST v2 - a large-scale lightweight benchmark for 2D and 3D biomedical image classification," Scientific Data, vol. 10, no. 1, p. 41, 2023.
- [29] J. N. Kather, J. Krisam, P. Charoentong, T. Luedde, E. Herpel, C.-A. Weis, T. Gaiser, A. Marx, N. A. Valous, D. Ferber, et al., "Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study," PLoS Medicine, vol. 16, no. 1, p. e1002730, 2019.
- [30] P. Tschandl, C. Rosendahl, and H. Kittler, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions," Scientific Data, vol. 5, no. 1, p. 180161, 2018.
- [31] J. Irvin, P. Rajpurkar, M. Ko, Y. Yu, S. Ciurea-Ilcus, C. Chute, H. Marklund, B. Haghgoo, R. Ball, K. Shpanskaya, et al., "CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 590–597, 2019.
- [32] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, and F. Lu, "Understanding adversarial attacks on deep learning based medical image analysis systems," Pattern Recognition, vol. 110, p. 107332, 2021.
- [33] S. G. Finlayson, J. D. Bowers, J. Ito, J. L. Zittrain, A. L. Beam, and I. S. Kohane, "Adversarial attacks on medical machine learning," Science, vol. 363, no. 6433, pp. 1287–1289, 2019.
- [34] Q. Huynh-Thu and M. Ghanbari, "Scope of validity of PSNR in image/video quality assessment," Electronics Letters, vol. 44, no. 13, pp. 800–801, 2008.
- [35] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
- [36] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57, IEEE, 2017.
- [37] C. Laidlaw, S. Singla, and S. Feizi, "Perceptual adversarial robustness: Defense against unseen threat models," arXiv preprint arXiv:2006.12655, 2020.
- [38] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, and M. Hein, "RobustBench: a standardized adversarial robustness benchmark," arXiv preprint arXiv:2010.09670, 2020.
- [39] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
- [40] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
- [41] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708, 2017.
- [42] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826, 2016.
- [43] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, "Training data-efficient image transformers & distillation through attention," in International Conference on Machine Learning, pp. 10347–10357, PMLR, 2021.
- [44] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, "Swin transformer: Hierarchical vision transformer using shifted windows," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022, 2021.
- [45] S. V. Dibbo, J. S. Moore, G. T. Kenyon, and M. A. Teti, "LCANets++: Robust audio classification using multi-layer neural networks with lateral competition," in 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 129–133, IEEE, 2024.
- [46] A. Lad, R. Bhale, and S. Belgamwar, "Fast gradient sign method (FGSM) variants in white box settings: A comparative study," in 2024 International Conference on Inventive Computation Technologies (ICICT), pp. 382–386, IEEE, 2024.