pith. machine review for the scientific record.

arxiv: 2604.22552 · v1 · submitted 2026-04-24 · 💻 cs.CV

Recognition: unknown

Transferable Physical-World Adversarial Patches Against Pedestrian Detection Models

Minghui Li, Shengshan Hu, Shihui Yan, Yifan Hu, Yufei Song, Ziqi Zhou


Pith reviewed 2026-05-08 12:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords adversarial patch · pedestrian detection · physical world attacks · multi-stage attack · transferable patch · triplet loss

The pith

TriPatch generates physical adversarial patches that attack multiple stages of the pedestrian detection pipeline while remaining robust to physical variation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method called TriPatch to create patches that can be printed and used to make pedestrian detectors fail in the real world. Existing patches often disrupt only one part of the detection process, letting the remaining stages compensate, and they do not hold up when lighting or viewing angle changes. TriPatch addresses this with a loss that hits confidence, box position, and the final selection step (NMS) all at once, while keeping the patch's appearance stable and training under varied conditions. A reader would care because these systems are used in cars and surveillance cameras, so stronger attacks reveal how to make them safer.

Core claim

The central discovery is that a triplet loss targeting detection confidence suppression, bounding-box offset amplification, and NMS disruption, combined with appearance consistency loss and data augmentation, enables generation of more effective and robust physical adversarial patches that achieve higher attack success rates across multiple detector models.

What carries the argument

The triplet loss that jointly disrupts confidence scores, bounding box predictions, and non-maximum suppression in the detection pipeline, supported by consistency constraints and physical augmentations.
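
The sketch below shows one way such a joint objective might be assembled in PyTorch. It is a minimal illustration, not the authors' code: the component definitions (max-confidence suppression, IoU minimization against ground truth, and a pairwise-overlap surrogate for NMS disruption) are plausible instantiations of the three stages named above, the weights mirror the λ coefficients of the paper's Eq. 2, the appearance term is omitted, and the names tripatch_style_loss and pairwise_iou are hypothetical helpers defined here.

    import torch

    def tripatch_style_loss(obj_scores, boxes, target_boxes,
                            w_det=1.0, w_iou=1.0, w_nms=0.5):
        # Illustrative multi-stage adversarial objective, minimized with
        # respect to the patch pixels. Not the paper's exact formulation.
        #   obj_scores:   (N,) pedestrian confidence per candidate box
        #   boxes:        (N, 4) predicted boxes, (x1, y1, x2, y2)
        #   target_boxes: (M, 4) ground-truth pedestrian boxes

        # Stage 1: confidence suppression -- drive the best score down.
        l_det = obj_scores.max()

        # Stage 2: box-offset amplification -- shrink each prediction's
        # best IoU with any ground-truth box, so surviving boxes localize
        # the pedestrian badly.
        l_iou = pairwise_iou(boxes, target_boxes).max(dim=1).values.mean()

        # Stage 3: NMS disruption (surrogate) -- reduce overlap among the
        # predictions themselves, so NMS cannot consolidate them into one
        # confident detection.
        l_nms = pairwise_iou(boxes, boxes).triu(diagonal=1).mean()

        return w_det * l_det + w_iou * l_iou + w_nms * l_nms

    def pairwise_iou(a, b):
        # IoU between every box in a (N, 4) and every box in b (M, 4),
        # xyxy format; returns an (N, M) tensor.
        area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
        area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
        lt = torch.max(a[:, None, :2], b[None, :, :2])
        rb = torch.min(a[:, None, 2:], b[None, :, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[..., 0] * wh[..., 1]
        return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-9)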

If this is right

  • The attacks become more transferable to different pedestrian detection models.
  • The patches maintain effectiveness despite physical variations in the environment.
  • Residual modules in detectors are less able to compensate for the perturbations.
  • Overall attack success rates increase compared to previous single-stage or non-augmented methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Security systems relying on pedestrian detection may require new countermeasures that protect all pipeline stages simultaneously.
  • Similar approaches could apply to other detection tasks like vehicle or object recognition in autonomous systems.
  • Further work could test the patches' performance on specific hardware setups used in real deployments.

Load-bearing premise

The simulated effects of the triplet loss and augmentations will carry over to actual physical patches without a large drop in performance due to real-world factors like printing errors or sensor differences.
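
A minimal sketch of the expectation-over-transformation style training this premise assumes: random photometric perturbations are applied to the patch each iteration so the optimized pattern survives lighting and print variation in expectation. The transform set, ranges, and the name augment_patch are illustrative assumptions, not the paper's exact augmentation pipeline.

    import torch

    def augment_patch(patch):
        # patch: (3, H, W) tensor in [0, 1].
        # Brightness/contrast jitter approximates lighting changes;
        # additive Gaussian noise approximates sensor and printing error.
        brightness = 1.0 + 0.3 * (2.0 * torch.rand(1) - 1.0)
        contrast = 1.0 + 0.2 * (2.0 * torch.rand(1) - 1.0)
        out = ((patch - 0.5) * contrast + 0.5) * brightness
        out = out + 0.02 * torch.randn_like(out)
        return out.clamp(0.0, 1.0)

    # During optimization the detector loss is averaged over many random
    # draws, so gradients favor patches that work in expectation over
    # physical conditions rather than for one fixed rendering.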

What would settle it

Printing the generated patches and attaching them to real pedestrians, then recording whether multiple detection models miss them under varied outdoor lighting, distances, and angles, and comparing the miss rates to those from earlier patch designs.
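
Schematically, that comparison reduces to a miss-rate computation per detector. The helper below is hypothetical and simplifies the attack-success-rate definition relative to whatever protocol the paper uses; it only fixes the bookkeeping.

    def attack_success_rate(person_scores_per_image, conf_threshold=0.5):
        # One inner list per patched test image, holding the detector's
        # 'person' confidence scores for that image. An image counts as a
        # successful attack if no score clears the threshold.
        misses = sum(
            1 for scores in person_scores_per_image
            if all(s < conf_threshold for s in scores)
        )
        return misses / max(len(person_scores_per_image), 1)

    # Repeat per detector (YOLO variants, Faster R-CNN, SSD, ...) and per
    # patch design; consistently higher rates for TriPatch under varied
    # lighting, distance, and angle would settle the claim.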

Figures

Figures reproduced from arXiv: 2604.22552 by Minghui Li, Shengshan Hu, Shihui Yan, Yifan Hu, Yufei Song, Ziqi Zhou.

Figure 1
Figure 1: Overview of adversarial examples on object detection. view at source ↗
Figure 2
Figure 2: The pipeline of TriPatch. The total objective optimized on the adversarial image x_adv combines three primary losses and one appearance regularizer: L_total = λ_det·L_det + λ_iou·L_iou + λ_nms·L_nms + λ_app·L_app (Eq. 2), where λ_det, λ_iou, λ_nms, and λ_app are weighting coefficients that balance the contribution of each loss component. view at source ↗
Figure 3
Figure 3: Digital-World Adversarial Attack Results of Our Method. view at source ↗
Figure 4
Figure 4: Physical attack demo of TriPatch. view at source ↗
Figure 5
Figure 5: The results (%) of the ablation study. (a)-(d) investigate the effect of different epochs, patch sizes, modules, and loss… view at source ↗
Figure 7
Figure 7: Performance consistency across multiple random… view at source ↗
read the original abstract

Physical adversarial patch attacks critically threaten pedestrian detection, causing surveillance and autonomous driving systems to miss pedestrians and creating severe safety risks. Despite their effectiveness in controlled settings, existing physical attacks face two major limitations in practice: they lack systematic disruption of the multi-stage decision pipeline, enabling residual modules to offset perturbations, and they fail to model complex physical variations, leading to poor robustness. To overcome these limitations, we propose a novel pedestrian adversarial patch generation method that combines multi-stage collaborative attacks with robustness enhancement under physical diversity, called TriPatch. Specifically, we design a triplet loss consisting of detection confidence suppression, bounding-box offset amplification, and non-maximum suppression (NMS) disruption, which jointly act across different stages of the detection pipeline. In addition, we introduce an appearance consistency loss to constrain the color distribution of the patch, thereby improving its adaptability under diverse imaging conditions, and incorporate data augmentation to further enhance robustness against complex physical perturbations. Extensive experiments demonstrate that TriPatch achieves a higher attack success rate across multiple detector models compared to existing approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes TriPatch, a method for generating transferable physical-world adversarial patches against pedestrian detection models. It introduces a triplet loss that jointly suppresses detection confidence, amplifies bounding-box offsets, and disrupts NMS across the multi-stage detection pipeline, combined with an appearance consistency loss to constrain patch color distribution and data augmentations to improve robustness under physical variations such as lighting and viewpoint changes. The central claim is that this design yields higher attack success rates (ASR) across multiple detector models than existing physical patch attacks.

Significance. If the empirical results hold under rigorous evaluation, the work is significant for safety-critical applications in autonomous driving and surveillance, as it directly targets residual compensation mechanisms in modern detectors and models physical diversity more explicitly than prior confidence-only attacks. The multi-component loss provides a principled way to attack the full pipeline rather than isolated stages, which could guide both attack and defense research.

minor comments (3)
  1. [Abstract] The claim of 'higher attack success rate across multiple detector models' is stated without any quantitative values, baselines, or ablation highlights; adding one or two key ASR numbers and model names would strengthen the summary without exceeding length limits.
  2. [Section 3] The method section describes the triplet loss in terms of its three goals, but the precise weighting or combination formula is not shown; an explicit equation would clarify how the components interact during optimization.
  3. [Evaluation] While multiple models and physical perturbations are mentioned, the manuscript would benefit from reporting standard deviations over repeated physical trials and a clear ablation isolating the contribution of the NMS-disruption term.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful summary of our TriPatch method and for the positive assessment of its significance for safety-critical applications. We appreciate the recommendation for minor revision and will ensure the final version incorporates any clarifications needed to further strengthen the presentation of the multi-stage triplet loss, appearance consistency, and physical augmentation components.

Circularity Check

0 steps flagged

No significant circularity; empirical method with direct loss definitions

full rationale

The paper proposes TriPatch as an empirical adversarial patch method. It defines a triplet loss directly targeting detector pipeline stages (confidence suppression, bounding-box offset amplification, NMS disruption) plus an appearance consistency loss and physical data augmentations. No equations, predictions, or first-principles derivations are present that reduce by construction to fitted inputs or self-referential definitions. Central claims rest on experimental validation across multiple detectors and physical perturbations, with no load-bearing self-citations or uniqueness theorems invoked. This is a standard self-contained empirical proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method rests on standard assumptions that detector stages are differentiable and that simulated augmentations approximate physical variations; no new entities or fitted constants are introduced in the abstract.

axioms (2)
  • domain assumption Detection pipeline stages (confidence, bounding box, NMS) can be independently targeted by loss terms.
    Invoked when defining the triplet loss components.
  • domain assumption Appearance consistency and data augmentation suffice to bridge simulation-to-real gap for patches.
    Central to the robustness enhancement claim.
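
To make the first axiom concrete, here is a minimal greedy NMS pass. Because its output depends only on scores and pairwise box overlap, a loss term can target this stage separately from the confidence and regression heads. A schematic implementation under that assumption, not any particular detector's code:

    def nms(boxes, scores, iou_threshold=0.5):
        # boxes: list of (x1, y1, x2, y2); returns indices of kept boxes.
        order = sorted(range(len(scores)), key=lambda i: scores[i],
                       reverse=True)
        keep = []
        while order:
            best = order.pop(0)
            keep.append(best)
            # Suppress every remaining box that overlaps the kept one.
            order = [i for i in order
                     if iou(boxes[best], boxes[i]) < iou_threshold]
        return keep

    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    # Scattering predictions so their pairwise IoU stays below the
    # threshold leaves NMS nothing to merge, which is the disruption the
    # triplet loss's third term aims at.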

pith-pipeline@v0.9.0 · 5490 in / 1181 out tokens · 26904 ms · 2026-05-08T12:23:35.383016+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

65 extracted references · 13 canonical work pages · 5 internal anchors

  1. [1]

    Towards Reliable Forgetting: A Survey on Machine Unlearning Verification

Lulu Xue, Shengshan Hu, Wei Lu, Yan Shen, Dongxu Li, Peijin Guo, Ziqi Zhou, Minghui Li, Yanjun Zhang, and Leo Yu Zhang. Towards reliable forgetting: A survey on machine unlearning verification. arXiv preprint arXiv:2506.15115, 2025

  2. [2]

    Ufvideo: Towards unified fine-grained video cooperative understanding with large language models

    Hewen Pan, Cong Wei, Dashuang Liang, Zepeng Huang, Pengfei Gao, Ziqi Zhou, Lulu Xue, Pengfei Yan, Xiaoming Wei, Minghui Li, et al. Ufvideo: Towards unified fine-grained video cooperative understanding with large language models. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR’26), 2026

  3. [3]

Tattoo: Training-free aesthetic-aware outfit recommendation

    Yuntian Wu, Xiaonan Hu, Ziqi Zhou, and Hao Lu. Tattoo: Training-free aesthetic-aware outfit recommendation. arXiv preprint arXiv:2509.23242, 2025

  4. [4]

Darkhash: A data-free backdoor attack against deep hashing

    Ziqi Zhou, Menghao Deng, Yufei Song, Hangtao Zhang, Wei Wan, Shengshan Hu, Minghui Li, Leo Yu Zhang, and Dezhong Yao. Darkhash: A data-free backdoor attack against deep hashing. IEEE Transactions on Information Forensics and Security, 2025

  5. [5]

    Badhash: Invisible backdoor attacks against deep hashing with clean label

Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan He, and Hai Jin. Badhash: Invisible backdoor attacks against deep hashing with clean label. In Proceedings of the 30th ACM International Conference on Multimedia (ACM MM’22), pages 678–686, 2022

  6. [6]

    Mars: A malignity-aware backdoor defense in federated learning

Wei Wan, Yuxuan Ning, Zhicong Huang, Cheng Hong, Shengshan Hu, Ziqi Zhou, Yechao Zhang, Tianqing Zhu, Wanlei Zhou, and Leo Yu Zhang. Mars: A malignity-aware backdoor defense in federated learning. In Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS’25), 2025

  7. [7]

    Detector collapse: Backdooring object detection to catastrophic overload or blindness

Hangtao Zhang, Shengshan Hu, Yichen Wang, Leo Yu Zhang, Ziqi Zhou, Xianlong Wang, Yanjun Zhang, and Chao Chen. Detector collapse: Backdooring object detection to catastrophic overload or blindness. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI’24), 2024

  8. [8]

    Test-time backdoor detection for object detection models

Hangtao Zhang, Yichen Wang, Shihui Yan, Chenyu Zhu, Ziqi Zhou, Linshan Hou, Shengshan Hu, Minghui Li, Yanjun Zhang, and Leo Yu Zhang. Test-time backdoor detection for object detection models. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR’25), pages 24377–24386, 2025

  9. [9]

Trojanrobot: Backdoor attacks against robotic manipulation in the physical world

    Xianlong Wang, Hewen Pan, Hangtao Zhang, Minghui Li, Shengshan Hu, Ziqi Zhou, Lulu Xue, Peijin Guo, Yichen Wang, Wei Wan, et al. Trojanrobot: Backdoor attacks against robotic manipulation in the physical world. arXiv e-prints, pages arXiv–2411, 2024

  10. [10]

Detecting and corrupting convolution-based unlearnable examples

    Minghui Li, Xianlong Wang, Zhifei Yu, Shengshan Hu, Ziqi Zhou, Longling Zhang, and Leo Yu Zhang. Detecting and corrupting convolution-based unlearnable examples. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI’25), volume 39, pages 18403–18411, 2025

  11. [11]

    Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification

Xianlong Wang, Shengshan Hu, Yechao Zhang, Ziqi Zhou, Leo Yu Zhang, Peng Xu, Wei Wan, and Hai Jin. Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification. In European Symposium on Research in Computer Security, pages 146–166. Springer, 2024

  12. [12]

Spa-vlm: Stealthy poisoning attacks on rag-based vlm

    Lei Yu, Yechao Zhang, Ziqi Zhou, Yang Wu, Wei Wan, Minghui Li, Shengshan Hu, Pei Xiaobing, and Jing Wang. Spa-vlm: Stealthy poisoning attacks on rag-based vlm. arXiv preprint arXiv:2505.23828, 2025

  13. [13]

Unlearnable 3d point clouds: Class-wise transformation is all you need

    Xianlong Wang, Minghui Li, Wei Liu, Hangtao Zhang, Shengshan Hu, Yechao Zhang, Ziqi Zhou, and Hai Jin. Unlearnable 3d point clouds: Class-wise transformation is all you need. In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS’24), volume 37, pages 99404–99432, 2024

  14. [14]

Towards evaluating the robustness of neural networks

    Nicholas Carlini and David A. Wagner. Towards evaluating the robustness of neural networks. CoRR, abs/1608.04644, 2016

  15. [15]

Advedm: Fine-grained adversarial attack against vlm-based embodied agents

    Yichen Wang, Hangtao Zhang, Hewen Pan, Ziqi Zhou, Xianlong Wang, Peijin Guo, Lulu Xue, Shengshan Hu, Minghui Li, Leo Yu Zhang, and Dezhong Yao. Advedm: Fine-grained adversarial attack against vlm-based embodied agents. In Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS’25), 2025

  16. [16]

    Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature

Yichen Wang, Yuxuan Chou, Ziqi Zhou, Hangtao Zhang, Wei Wan, Shengshan Hu, and Minghui Li. Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature. In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI’25), 2025

  17. [17]

    Pb-uap: Hybrid universal adversarial attack for image segmentation

Yufei Song, Ziqi Zhou, Minghui Li, Xianlong Wang, Menghao Deng, Wei Wan, Shengshan Hu, and Leo Yu Zhang. Pb-uap: Hybrid universal adversarial attack for image segmentation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’25), 2025

  18. [18]

    Dap: A dynamic adversarial patch for evading person detectors

Amira Guesmi, Ruitian Ding, Muhammad Abdullah Hanif, Ihsen Alouani, and Muhammad Shafique. Dap: A dynamic adversarial patch for evading person detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24595–24604. IEEE, 2024

  19. [19]

    Physically realizable natural-looking clothing textures evade person detectors via 3d modeling

Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, and Xiaolin Hu. Physically realizable natural-looking clothing textures evade person detectors via 3d modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16975–16984. IEEE, 2023

  20. [20]

    Transferable adversarial facial images for privacy protection

Minghui Li, Jiangxiong Wang, Hao Zhang, Ziqi Zhou, Shengshan Hu, and Xiaobing Pei. Transferable adversarial facial images for privacy protection. In Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM’24), pages 10649–10658, 2024

  21. [21]

Segtrans: Transferable adversarial examples for segmentation models

    Yufei Song, Ziqi Zhou, Qi Lu, Hangtao Zhang, Yifan Hu, Lulu Xue, Shengshan Hu, Minghui Li, and Leo Yu Zhang. Segtrans: Transferable adversarial examples for segmentation models. IEEE Transactions on Multimedia, 2025

  22. [22]

    Erosion attack for adversarial training to enhance semantic segmentation robustness

Yufei Song, Ziqi Zhou, Menghao Deng, Yifan Hu, Shengshan Hu, Minghui Li, and Leo Yu Zhang. Erosion attack for adversarial training to enhance semantic segmentation robustness. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’26), 2026

  23. [23]

Advedm: Fine-grained adversarial attack against vlm-based embodied agents

    Yichen Wang, Hangtao Zhang, Hewen Pan, Ziqi Zhou, Xianlong Wang, Peijin Guo, Lulu Xue, Shengshan Hu, Minghui Li, and Leo Yu Zhang. Advedm: Fine-grained adversarial attack against vlm-based embodied agents. In Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS’25), 2025

  24. [24]

Fooling automated surveillance cameras: Adversarial patches to attack person detection

    Simen Thys, Wiebe Van Ranst, and Toon Goedemé. Fooling automated surveillance cameras: Adversarial patches to attack person detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 0–0, 2019

  25. [25]

    Dual attention suppression attack: Generate adversarial camouflage in physical world

Jiakai Wang, Aishan Liu, Zixin Yin, Shunchang Liu, Shiyu Tang, and Xianglong Liu. Dual attention suppression attack: Generate adversarial camouflage in physical world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8565–8574, 2021

  26. [26]

    Adversarial t-shirt! evading person detectors in a physical world

Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, and Xue Lin. Adversarial t-shirt! evading person detectors in a physical world. In European Conference on Computer Vision (ECCV), pages 665–681. Springer, 2020

  27. [27]

    On physical adversarial patches for object detection

    Mark Lee and Zico Kolter. On physical adversarial patches for object detection. arXiv preprint arXiv:1906.11897, 2019

  28. [28]

    Rich feature hierarchies for accurate object detection and semantic segmentation

Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 580–587, 2014

  29. [29]

    Fast r-cnn

Ross Girshick. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 1440–1448. IEEE, 2015

  30. [30]

Faster r-cnn: Towards real-time object detection with region proposal networks

    Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. CoRR, abs/1506.01497, 2015

  31. [31]

    You only look once: Unified, real-time object detection

Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 779–788, Las Vegas, NV, USA, 2016. IEEE

  32. [32]

    YOLOv3: An Incremental Improvement

Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018

  33. [33]

    YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020

  34. [34]

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

    Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7464–7475, Vancouver, BC, Canada, 2023. IEEE

  35. [35]

    Ssd: Single shot multibox detector

    Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander Berg. Ssd: Single shot multibox detector. In European Conference on Computer Vision (ECCV), volume 9905, pages 21–37. Springer, 2016

  36. [36]

    Focal loss for dense object detection

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 2980–2988, Venice, Italy, 2017. IEEE

  37. [37]

    Securely fine-tuning pre-trained encoders against adversarial examples

Ziqi Zhou, Minghui Li, Wei Liu, Shengshan Hu, Yechao Zhang, Wei Wan, Lulu Xue, Leo Yu Zhang, Dezhong Yao, and Hai Jin. Securely fine-tuning pre-trained encoders against adversarial examples. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP’24), 2024

  38. [38]

    Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning

Ziqi Zhou, Shengshan Hu, Minghui Li, Hangtao Zhang, Yechao Zhang, and Hai Jin. Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning. In Proceedings of the 32nd ACM International Conference on Multimedia (MM’23), pages 6311–6320, 2023

  39. [39]

    Downstream-agnostic adversarial examples

Ziqi Zhou, Shengshan Hu, Ruizhi Zhao, Qian Wang, Leo Yu Zhang, Junhui Hou, and Hai Jin. Downstream-agnostic adversarial examples. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV’23), pages 4345–4355, 2023

  40. [40]

    Numbod: A spatial-frequency fusion attack against object detectors

Zihan Zhou, Bo Li, Yifan Song, et al. Numbod: A spatial-frequency fusion attack against object detectors. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 1201–1209, 2025

  41. [41]

    Darksam: Fooling segment anything model to segment nothing

Ziqi Zhou, Yufei Song, Minghui Li, Shengshan Hu, Xianlong Wang, Leo Yu Zhang, Dezhong Yao, and Hai Jin. Darksam: Fooling segment anything model to segment nothing. In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS’24), 2024

  42. [42]

    Vanish into thin air: Cross-prompt universal adversarial attacks for sam2

Ziqi Zhou, Yifan Hu, Yufei Song, Zijing Li, Shengshan Hu, Leo Yu Zhang, Dezhong Yao, Long Zheng, and Hai Jin. Vanish into thin air: Cross-prompt universal adversarial attacks for sam2. In Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS’25), 2025

  43. [43]

    Towards robust rain removal against adversarial attacks: A comprehensive benchmark analysis and beyond

    Yi Yu, Wenhan Yang, Yap-Peng Tan, and Alex C Kot. Towards robust rain removal against adversarial attacks: A comprehensive benchmark analysis and beyond. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’22), 2022

  44. [44]

    Benchmarking adversarial robustness of image shadow removal with shadow-adaptive attacks

Chong Wang, Yi Yu, Lanqing Guo, and Bihan Wen. Benchmarking adversarial robustness of image shadow removal with shadow-adaptive attacks. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’24), 2024

  45. [45]

    Transferable adversarial attacks on sam and its downstream models

Song Xia, Wenhan Yang, Yi Yu, Xun Lin, Henghui Ding, Lingyu Duan, and Xudong Jiang. Transferable adversarial attacks on sam and its downstream models. In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS’24), 2024

  46. [46]

    From pretrain to pain: Adversarial vulnerability of video foundation models without task knowledge

Hui Lu, Yi Yu, Song Xia, Yiming Yang, Deepu Rajan, Boon Poh Ng, Alex Kot, and Xudong Jiang. From pretrain to pain: Adversarial vulnerability of video foundation models without task knowledge. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI’26), 2026

  47. [47]

    Intriguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013

  48. [48]

Explaining and harnessing adversarial examples

    Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. International Conference on Learning Representations (ICLR), 2015

  49. [49]

    Towards Deep Learning Models Resistant to Adversarial Attacks

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017

  50. [50]

    Camou: Learning physical vehicle camouflages to adversarially attack detectors in the wild

    Yang Zhang, Hassan Foroosh, Philip David, and Boqing Gong. Camou: Learning physical vehicle camouflages to adversarially attack detectors in the wild. In International Conference on Learning Representations (ICLR), 2019

  51. [51]

    Synthesizing robust adversarial examples

Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. In International Conference on Machine Learning (ICML), pages 284–293. PMLR, 2018

  52. [52]

Uv-attack: Physical-world adversarial attacks for person detection via dynamic-nerf-based uv mapping

    Yanjie Li, Wenxuan Zhang, Kaisheng Liang, and Bin Xiao. Uv-attack: Physical-world adversarial attacks for person detection via dynamic-nerf-based uv mapping. In International Conference on Learning Representations (ICLR), 2025. Poster

  53. [53]

Adversarial patch

    Tom B Brown, Dandelion Mané, Aurko Roy, Martín Abadi, and Justin Gilmer. Adversarial patch. arXiv preprint arXiv:1712.09665, 2017

  54. [54]

    Naturalistic physical adversarial patch for object detectors

Yu-Chih-Tuan Hu, Bo-Han Kung, Daniel Stanley Tan, JunCheng Chen, Kai-Lung Hua, and Wen-Huang Cheng. Naturalistic physical adversarial patch for object detectors. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 7848–7857, 2021

  55. [55]

    Adversarial camouflage: Hiding physical-world attacks with natural styles

    Ranjie Duan, Xingjun Ma, Yisen Wang, James Bailey, A Kai Qin, and Yun Yang. Adversarial camouflage: Hiding physical-world attacks with natural styles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1000–1008, 2020

  56. [56]

Revisiting adversarial patches for designed camera-agnostic attacks against person detection

    Haoxuan Wei, Zhicong Wang, Kai Zhang, Jinghui Hou, Yifei Liu, Haotian Tang, and Zhen Wang. Revisiting adversarial patches for designed camera-agnostic attacks against person detection. Advances in Neural Information Processing Systems (NeurIPS), 37:8047–8064, 2024

  57. [57]

Doepatch: Dynamically optimized ensemble model for adversarial patches generation

    Wenbin Tan, Yifan Li, Cheng Zhao, Xin Chen, and Lei Wang. Doepatch: Dynamically optimized ensemble model for adversarial patches generation. IEEE Transactions on Information Forensics and Security, 19:9039–9054, 2024

  58. [58]

    Histograms of oriented gradients for human detection

Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 886–893. IEEE, 2005

  59. [59]

Microsoft coco: Common objects in context

    Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft coco: Common objects in context. In Proceedings of the European Conference on Computer Vision (ECCV), pages 740–755. Springer, 2014

  60. [60]

Yolo9000: Better, faster, stronger

    Joseph Redmon and Ali Farhadi. Yolo9000: Better, faster, stronger. CoRR, abs/1612.08242, 2016

  61. [61]

    ultralytics/yolov5

Glenn Jocher, Alex Stoken, Jirka Borovec, NanoCode012, ChristopherSTAN, Liu Changyu, Laughing, tkianai, Adam Hogan, lorenzomammana, yxNONG, AlexWang1900, Laurentiu Diaconu, Marc, wanghaoyang0106, ml5ah, Doug, Francisco Ingham, Frederik, Guilhen, Hatovix, Jake Poznanski, Jiacong Fang, Lijun Yu, changyu98, Mingyu Wang, Naman Gupta, Osama Akhtar, Petr Dvor...

  62. [62]

Yolov8

    Glenn Jocher, Ayush Chaurasia, and Jing Qiu. Yolov8. https://github.com/ultralytics/ultralytics, 2023. Version 8.0.0, GitHub repository, Ultralytics

  63. [63]

    Yolov9: Learning what you want to learn using programmable gradient information

Chien-Yao Wang, I-Hau Yeh, and Hong-Yuan Mark Liao. Yolov9: Learning what you want to learn using programmable gradient information. In Proceedings of the European Conference on Computer Vision (ECCV), Cham, Switzerland, 2024. Springer Nature Switzerland

  64. [64]

    T-sea: Transfer-based self-ensemble attack on object detection

Haotian Huang, Zhi Chen, Hao Chen, et al. T-sea: Transfer-based self-ensemble attack on object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20514–20523. IEEE, 2023

  65. [65]

Upc: Learning universal physical camouflage attacks on object detectors

    Lifeng Huang, Chengying Gao, Yuyin Zhou, Changqing Zou, Cihang Xie, Alan L. Yuille, and Ning Liu. Upc: Learning universal physical camouflage attacks on object detectors. CoRR, abs/1909.04326, 2019