Beyond Attack Success Rate: A Multi-Metric Evaluation of Adversarial Transferability in Medical Imaging Models
Pith reviewed 2026-05-10 10:59 UTC · model grok-4.3
The pith
Attack success rate alone misses key image quality factors in adversarial evaluations of medical imaging models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A systematic study across the PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert datasets demonstrates that perceptual and distortion metrics remain strongly associated with one another while exhibiting minimal correlation with attack success rate, and that this dissociation holds consistently for both CNN and Vision Transformer models under multiple attack methods at several perturbation budgets.
What carries the argument
Multi-metric evaluation that jointly tracks attack success rate together with PSNR, SSIM, and L2 perturbation magnitude to expose relationships among efficacy, visual quality, and transfer behavior.
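A minimal sketch of what such a joint evaluation loop could look like, assuming clean and adversarial image batches scaled to [0, 1] and a generic classifier; `evaluate_attack` and its arguments are illustrative placeholders, not the paper's implementation:

```python
# Hedged sketch: jointly tracking ASR, PSNR, SSIM, and L2 for one attack run.
# `model`, `clean`, `adv`, and `labels` are hypothetical placeholders.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_attack(model, clean, adv, labels):
    """Return ASR plus mean PSNR, SSIM, and L2 over a batch.

    clean, adv: float arrays in [0, 1], shape (N, H, W, C).
    labels:     ground-truth class indices, shape (N,).
    model:      callable mapping an image batch to predicted classes.
    """
    # Untargeted success = misclassification of the adversarial input;
    # the paper's exact ASR definition may condition on clean correctness.
    preds = np.asarray(model(adv))
    asr = float(np.mean(preds != labels))

    psnr, ssim, l2 = [], [], []
    for x, x_adv in zip(clean, adv):
        psnr.append(peak_signal_noise_ratio(x, x_adv, data_range=1.0))
        ssim.append(structural_similarity(x, x_adv, data_range=1.0, channel_axis=-1))
        l2.append(np.linalg.norm((x_adv - x).ravel()))  # per-image L2 magnitude

    return {"asr": asr, "psnr": float(np.mean(psnr)),
            "ssim": float(np.mean(ssim)), "l2": float(np.mean(l2))}
```

Repeating such a loop over every (model, attack, budget) configuration yields the per-configuration records on which a correlation analysis can operate.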
If this is right
- Adversarial robustness assessments in medical AI must incorporate image quality and overhead measures alongside success rate to avoid underestimating real risk.
- The same dissociation between success rate and perceptual metrics holds for both convolutional networks and vision transformers.
- Attack transferability studies gain accuracy when they report perturbation strength and resulting image fidelity in addition to binary success.
- Clinical deployment decisions should draw on multi-metric profiles rather than success-rate rankings alone.
Where Pith is reading between the lines
- Attack methods that maximize success at low perceptual cost may transfer differently than those that simply achieve high success regardless of distortion.
- The observed metric dissociation could be tested by checking whether models trained to resist low-distortion attacks also resist high-distortion ones.
- Extending the same multi-metric protocol to natural-image benchmarks would reveal whether the pattern is specific to medical data or more general.
Load-bearing premise
The four chosen medical datasets, seven models, seven attack methods, and five perturbation budgets are representative enough to support the general claim that attack success rate alone is inadequate for evaluating adversarial transferability and robustness.
What would settle it
Finding a strong correlation (of either sign) between attack success rate and at least one perceptual or distortion metric when the same evaluation protocol is repeated on a previously untested medical imaging dataset or model architecture would contradict the reported pattern.
read the original abstract
While deep learning systems are becoming increasingly prevalent in medical image analysis, their vulnerabilities to adversarial perturbations raise serious concerns for clinical deployment. These vulnerability evaluations largely rely on Attack Success Rate (ASR), a binary metric that indicates solely whether an attack is successful. However, the ASR metric does not account for other factors, such as perturbation strength, perceptual image quality, and cross-architecture attack transferability, and therefore, the interpretation is incomplete. This gap requires consideration, as complex, large-scale deep learning systems, including Vision Transformers (ViTs), are increasingly challenging the dominance of Convolutional Neural Networks (CNNs). These architectures learn differently, and it is unclear whether a single metric, e.g., ASR, can effectively capture adversarial behavior. To address this, we perform a systematic empirical study on four medical image datasets: PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert. We evaluate seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, and ViT-B/16) against seven attack methods at five perturbation budgets, measuring ASR, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and $L_2$ perturbation magnitude. Our findings show a consistent pattern: perceptual and distortion metrics are strongly associated with one another and exhibit minimal correlation with ASR. This applies to both CNNs and ViTs. The results demonstrate that ASR alone is an inadequate indicator of adversarial robustness and transferability. Consequently, we argue that a thorough assessment of adversarial risk in medical AI necessitates multi-metric frameworks that encompass not only the attack efficacy but also its methodology and associated overheads.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a systematic empirical study on adversarial attacks in medical imaging, evaluating Attack Success Rate (ASR) together with perceptual metrics (PSNR, SSIM) and distortion (L2 magnitude) across four datasets (PathMNIST, DermaMNIST, RetinaMNIST, CheXpert), seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, ViT-B/16), seven attack methods, and five perturbation budgets. It reports that perceptual and distortion metrics are strongly inter-correlated but show minimal correlation with ASR for both CNNs and ViTs, concluding that ASR alone is an inadequate indicator of adversarial robustness and transferability and that multi-metric frameworks are needed.
Significance. If the low-correlation pattern is robustly quantified, the work would usefully demonstrate that binary ASR overlooks important aspects of attack impact such as image quality degradation and cross-architecture transfer in medical settings. The multi-architecture (CNN + ViT) and multi-dataset design provides timely observational evidence at a moment when transformers are entering medical imaging pipelines, potentially motivating more comprehensive robustness benchmarks that account for perturbation overhead.
major comments (2)
- [Abstract] Abstract: the assertion that perceptual/distortion metrics 'exhibit minimal correlation with ASR' is presented without any reported correlation coefficients, p-values, or description of the statistical procedure (e.g., Pearson/Spearman, multiple-comparison correction). Because this quantitative pattern is the sole empirical basis for declaring ASR inadequate, the absence of these details prevents assessment of whether the correlation is truly minimal or merely below an arbitrary threshold.
- [Abstract] Abstract and concluding discussion: the generalization that 'ASR alone is an inadequate indicator of adversarial robustness and transferability' in medical AI rests on four 2D classification datasets and the listed model/attack combinations. No sensitivity analysis, additional modalities (3D volumes, ultrasound), or argument for representativeness is supplied; if the observed decoupling is specific to these choices rather than general, the call for multi-metric frameworks does not necessarily extend to the broader clinical domain.
minor comments (1)
- [Abstract] Abstract: the phrase 'associated with one another' for PSNR/SSIM/L2 should be accompanied by the actual inter-metric correlation values in the results section so readers can judge the strength of the 'strong' association.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on statistical reporting and scope. We address each major comment below and will incorporate revisions to strengthen the manuscript.
read point-by-point responses
Referee: [Abstract] Abstract: the assertion that perceptual/distortion metrics 'exhibit minimal correlation with ASR' is presented without any reported correlation coefficients, p-values, or description of the statistical procedure (e.g., Pearson/Spearman, multiple-comparison correction). Because this quantitative pattern is the sole empirical basis for declaring ASR inadequate, the absence of these details prevents assessment of whether the correlation is truly minimal or merely below an arbitrary threshold.
Authors: We agree that the abstract should include quantitative details to support the claim. Our analysis computed Pearson and Spearman correlations between ASR and the perceptual/distortion metrics (PSNR, SSIM, L2) across all experiments, with multiple-comparison correction. We will revise the abstract to report the correlation coefficients, p-values, and a concise description of the procedure (Pearson/Spearman with Bonferroni correction), allowing readers to evaluate the minimal-correlation finding directly. revision: yes
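For concreteness, a minimal sketch of the kind of analysis this response describes, using SciPy's `pearsonr` and `spearmanr` with a Bonferroni correction; the `results` records and metric keys are hypothetical placeholders rather than the paper's code:

```python
# Hedged sketch: correlate ASR with each image-quality metric across all
# (model, attack, budget) configurations, with Bonferroni correction.
from scipy.stats import pearsonr, spearmanr

def correlate_with_asr(results, metrics=("psnr", "ssim", "l2"), alpha=0.05):
    """results: list of dicts, one per experimental configuration."""
    asr = [r["asr"] for r in results]
    corrected_alpha = alpha / (2 * len(metrics))  # Pearson + Spearman per metric
    table = {}
    for m in metrics:
        vals = [r[m] for r in results]
        r_p, p_p = pearsonr(asr, vals)    # linear association
        r_s, p_s = spearmanr(asr, vals)   # monotonic (rank) association
        table[m] = {"pearson": (r_p, p_p, p_p < corrected_alpha),
                    "spearman": (r_s, p_s, p_s < corrected_alpha)}
    return table
```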
Referee: [Abstract] Abstract and concluding discussion: the generalization that 'ASR alone is an inadequate indicator of adversarial robustness and transferability' in medical AI rests on four 2D classification datasets and the listed model/attack combinations. No sensitivity analysis, additional modalities (3D volumes, ultrasound), or argument for representativeness is supplied; if the observed decoupling is specific to these choices rather than general, the call for multi-metric frameworks does not necessarily extend to the broader clinical domain.
Authors: We acknowledge the scope is limited to 2D classification on four standard benchmark datasets (PathMNIST, DermaMNIST, RetinaMNIST, CheXpert) chosen for their coverage of common medical imaging tasks and modalities. The consistent low correlation pattern holds across both CNN and ViT architectures. We will add an explicit limitations paragraph in the discussion that argues for representativeness within 2D medical imaging, notes the absence of sensitivity analyses on 3D or ultrasound data, and recommends future multi-metric studies on additional modalities to test broader generalizability. revision: partial
Circularity Check
No circularity: purely observational empirical measurements
full rationale
The paper conducts direct experiments across four datasets, seven models, seven attacks, and five budgets, then reports measured correlations between ASR and perceptual/distortion metrics (PSNR, SSIM, L2). No equations, derivations, fitted parameters presented as predictions, or self-citations are used to support the central claim. The findings are observational patterns from the collected data, with no reduction of results to inputs by construction. This is a standard empirical study whose conclusions rest on the representativeness of the chosen setups rather than any circular logic.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Adversarial attacks are generated under standard $L_p$-norm constraints at fixed perturbation budgets.
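As a concrete illustration of this assumption, a minimal sketch of the fixed-budget constraint for the common L-infinity case; the budget eps = 8/255 and the [0, 1] pixel range are illustrative choices, not values taken from the paper:

```python
# Hedged sketch: enforce a fixed L-infinity perturbation budget.
import numpy as np

def project_linf(x_adv, x_clean, eps=8 / 255):
    """Project an adversarial image back into the L-inf ball around x_clean."""
    delta = np.clip(x_adv - x_clean, -eps, eps)  # enforce ||delta||_inf <= eps
    return np.clip(x_clean + delta, 0.0, 1.0)    # keep pixels in the valid range
```

Iterative attacks such as PGD apply exactly this kind of projection after each gradient step, which is what evaluating at "fixed perturbation budgets" refers to.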
Forward citations
Cited by 1 Pith paper
- Single-Configuration Attack Success Rate Is Not Enough: Jailbreak Evaluations Should Report Distributional Attack Success
  Jailbreak evaluations must report distributional statistics such as Variant Sensitivity Measure and Union Coverage across parameter variants rather than single best-case attack success rates.
Reference graph
Works this paper leans on
- [1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
- [2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, 2012.
- [3] S. V. Dibbo, A. Breuer, J. Moore, and M. Teti, "Improving robustness to model inversion attacks via sparse coding architectures," in European Conference on Computer Vision, pp. 117–136, Springer, 2024.
- [4] D. Amebley and S. Dibbo, "Are neuro-inspired multi-modal vision-language models resilient to membership inference privacy leakage?," arXiv preprint arXiv:2511.20710, 2025.
- [5] C.-W. Lien, S. Vhaduri, S. V. Dibbo, and M. Shaheed, "Explaining vulnerabilities of heart rate biometric models securing IoT wearables," Machine Learning with Applications, vol. 16, p. 100559, 2024.
- [6] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
- [7] S. Vhaduri, S. V. Dibbo, C.-Y. Chen, and C. Poellabauer, "Predicting next call duration: A future direction to promote mental health in the age of lockdown," in 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 804–811, IEEE, 2021.
- [8] D. Andor, C. Alberti, D. Weiss, A. Severyn, A. Presta, K. Ganchev, S. Petrov, and M. Collins, "Globally normalized transition-based neural networks," in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2442–2452, 2016.
- [9] S. Vhaduri, W. Cheung, and S. V. Dibbo, "Bag of on-phone ANNs to secure IoT objects using wearable and smartphone biometrics," IEEE Transactions on Dependable and Secure Computing, vol. 21, no. 3, pp. 1127–1138, 2023.
- [10] R. K. Sah and H. Ghasemzadeh, "Adversarial transferability in wearable sensor systems," arXiv preprint arXiv:2003.07982, 2020.
- [11] M. Nasr, J. Rando, N. Carlini, J. Hayase, M. Jagielski, A. F. Cooper, D. Ippolito, C. A. Choquette-Choo, F. Tramèr, and K. Lee, "Scalable extraction of training data from aligned, production language models," in The Thirteenth International Conference on Learning Representations, 2025.
- [12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
- [13] Y. Liu, X. Chen, C. Liu, and D. Song, "Delving into transferable adversarial examples and black-box attacks," arXiv preprint arXiv:1611.02770, 2016.
- [14] D. Popovic, A. Sadeghi, T. Yu, S. Chawla, and I. Khalil, "DeBackdoor: A deductive framework for detecting backdoor attacks on deep models with limited data," in 34th USENIX Security Symposium (USENIX Security 25), pp. 6419–6438, 2025.
- [15] U. Sara, M. Akter, M. S. Uddin, et al., "Image quality assessment through FSIM, SSIM, MSE and PSNR - a comparative study," Journal of Computer and Communications, vol. 7, no. 3, pp. 8–18, 2019.
- [16] B. Bilgic, I. Chatnuntawech, A. P. Fan, K. Setsompop, S. F. Cauley, L. L. Wald, and E. Adalsteinsson, "Fast image reconstruction with L2-regularization," Journal of Magnetic Resonance Imaging, vol. 40, no. 1, pp. 181–191, 2014.
- [17] J. Benesty, J. Chen, Y. Huang, and I. Cohen, "Pearson correlation coefficient," in Noise Reduction in Speech Processing, pp. 1–4, Springer, 2009.
- [18] P. Sedgwick, "Spearman's rank correlation coefficient," BMJ, vol. 349, 2014.
- [19] J. Piet, M. Alrashed, C. Sitawarin, S. Chen, Z. Wei, E. Sun, B. Alomair, and D. Wagner, "Jatmo: Prompt injection defense by task-specific fine-tuning," in European Symposium on Research in Computer Security, pp. 105–124, Springer, 2024.
- [20] L. D. Peellawalage, S. Dibbo, and S. Vhaduri, "Meta-research on backdoors: Dataset and threat model shifts in multimodal backdoor attacks," 2026.
- [21] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
- [22] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial machine learning at scale," arXiv preprint arXiv:1611.01236, 2016.
- [23] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," arXiv preprint arXiv:1706.06083, 2017.
- [24] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li, "Boosting adversarial attacks with momentum," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9185–9193, 2018.
- [25] X. Wang and K. He, "Enhancing the transferability of adversarial attacks through variance tuning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1924–1933, 2021.
- [26] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, and A. L. Yuille, "Improving transferability of adversarial examples with input diversity," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2730–2739, 2019.
- [27] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
- [28] J. Yang, R. Shi, D. Wei, Z. Liu, L. Zhao, B. Ke, H. Pfister, and B. Ni, "MedMNIST v2 - a large-scale lightweight benchmark for 2D and 3D biomedical image classification," Scientific Data, vol. 10, no. 1, p. 41, 2023.
- [29] J. N. Kather, J. Krisam, P. Charoentong, T. Luedde, E. Herpel, C.-A. Weis, T. Gaiser, A. Marx, N. A. Valous, D. Ferber, et al., "Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study," PLoS Medicine, vol. 16, no. 1, p. e1002730, 2019.
- [30] P. Tschandl, C. Rosendahl, and H. Kittler, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions," Scientific Data, vol. 5, no. 1, p. 180161, 2018.
- [31] J. Irvin, P. Rajpurkar, M. Ko, Y. Yu, S. Ciurea-Ilcus, C. Chute, H. Marklund, B. Haghgoo, R. Ball, K. Shpanskaya, et al., "CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 590–597, 2019.
- [32] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, and F. Lu, "Understanding adversarial attacks on deep learning based medical image analysis systems," Pattern Recognition, vol. 110, p. 107332, 2021.
- [33] S. G. Finlayson, J. D. Bowers, J. Ito, J. L. Zittrain, A. L. Beam, and I. S. Kohane, "Adversarial attacks on medical machine learning," Science, vol. 363, no. 6433, pp. 1287–1289, 2019.
- [34] Q. Huynh-Thu and M. Ghanbari, "Scope of validity of PSNR in image/video quality assessment," Electronics Letters, vol. 44, no. 13, pp. 800–801, 2008.
- [35] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
- [36] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57, IEEE, 2017.
- [37] C. Laidlaw, S. Singla, and S. Feizi, "Perceptual adversarial robustness: Defense against unseen threat models," arXiv preprint arXiv:2006.12655, 2020.
- [38] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, and M. Hein, "RobustBench: a standardized adversarial robustness benchmark," arXiv preprint arXiv:2010.09670, 2020.
- [39] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
- [40] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
- [41] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708, 2017.
- [42] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826, 2016.
- [43] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, "Training data-efficient image transformers & distillation through attention," in International Conference on Machine Learning, pp. 10347–10357, PMLR, 2021.
- [44] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, "Swin transformer: Hierarchical vision transformer using shifted windows," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022, 2021.
- [45] S. V. Dibbo, J. S. Moore, G. T. Kenyon, and M. A. Teti, "LCANets++: Robust audio classification using multi-layer neural networks with lateral competition," in 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 129–133, IEEE, 2024.
- [46] A. Lad, R. Bhale, and S. Belgamwar, "Fast gradient sign method (FGSM) variants in white box settings: A comparative study," in 2024 International Conference on Inventive Computation Technologies (ICICT), pp. 382–386, IEEE, 2024.