Physical Adversarial Clothing Evades Visible-Thermal Detectors via Non-Overlapping RGB-T Pattern
Recognition: 2 theorem links · Lean Theorem
Pith reviewed 2026-05-08 17:46 UTC · model grok-4.3
The pith
Adversarial clothing with non-overlapping visible and thermal patterns evades RGB-T detectors in both digital and physical settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Non-overlapping RGB-T patterns on adversarial clothing, generated via spatial discrete-continuous optimization on full-view 3D models, produce high attack success rates on visible-thermal detectors across different fusion architectures in both digital and physical worlds, while a fusion-stage ensemble improves transferability to unseen detectors.
What carries the argument
The non-overlapping RGB-T pattern (NORP), which places distinct visible and thermal adversarial materials on separate regions of the clothing and is optimized by spatial discrete-continuous optimization (SDCO) on 3D human and clothing models to enable full 360-degree attacks.
Load-bearing premise
The 3D RGB-T models and material simulations accurately capture real-world lighting, thermal emission, and sensor responses across all viewing angles.
What would settle it
A controlled physical test in which the printed adversarial clothing is worn by a moving person under varied outdoor lighting and viewing angles, with the measured attack success rate then compared against the simulated rates on the same detectors.
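The settling experiment reduces to comparing measured against simulated attack success rates. A minimal sketch of the ASR computation itself (the function name, per-frame detection format, and 0.5 confidence threshold are our assumptions, not details from the paper):

```python
def attack_success_rate(detections, conf_threshold=0.5):
    """Fraction of frames in which the worn pattern suppresses every
    'person' detection above the confidence threshold.

    `detections` holds one entry per captured frame; each entry is a
    list of (label, confidence) tuples from the RGB-T detector.
    """
    evaded = sum(
        1 for frame in detections
        if not any(label == "person" and conf >= conf_threshold
                   for label, conf in frame)
    )
    return evaded / len(detections) if detections else 0.0

# Hypothetical captures: the person is missed in 2 of 3 frames.
frames = [
    [("person", 0.91)],                 # detected -> attack failed
    [("car", 0.80)],                    # no person -> attack succeeded
    [("person", 0.32), ("car", 0.60)],  # below threshold -> succeeded
]
print(attack_success_rate(frames))  # → 0.6666666666666666 (2 of 3 frames evaded)
```

The same function applied to rendered and real captures of identical viewpoints would give the sim-to-real comparison the review asks for.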
Original abstract
Visible-thermal (RGB-T) object detection is a crucial technology for applications such as autonomous driving, where multimodal fusion enhances performance in challenging conditions like low light. However, the security of RGB-T detectors, particularly in the physical world, has been largely overlooked. This paper proposes a novel approach to RGB-T physical attacks using adversarial clothing with a non-overlapping RGB-T pattern (NORP). To simulate full-view (0$^{\circ}$--360$^{\circ}$) RGB-T attacks, we construct 3D RGB-T models for human and adversarial clothing. NORP is a new adversarial pattern design using distinct visible and thermal materials without overlap, avoiding the light reduction in overlapping RGB-T patterns (ORP). To optimize the NORP on adversarial clothing, we propose a spatial discrete-continuous optimization (SDCO) method. We systematically evaluated our method on RGB-T detectors with different fusion architectures, demonstrating high attack success rates both in the digital and physical worlds. Additionally, we introduce a fusion-stage ensemble method that enhances the transferability of adversarial attacks across unseen RGB-T detectors with different fusion architectures.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes non-overlapping RGB-T adversarial patterns (NORP) printed on clothing to evade visible-thermal (RGB-T) object detectors. It constructs 3D RGB-T models of humans and clothing to enable full 0°–360° view simulation, introduces a spatial discrete-continuous optimization (SDCO) procedure to generate the patterns, reports high attack success rates (ASR) against RGB-T detectors with varied fusion architectures in both digital and physical settings, and adds a fusion-stage ensemble to boost transferability to unseen detectors.
Significance. If the physical-world transfer results hold under rigorous validation, the work would be significant for highlighting practical vulnerabilities in multimodal RGB-T detectors used in safety-critical settings such as autonomous driving. The NORP design directly addresses light-reduction problems of overlapping patterns, the 3D full-view modeling is a reasonable attempt to handle viewpoint variation, and the ensemble technique targets a known weakness in adversarial transfer. These elements could inform future defense research if supported by stronger empirical grounding.
major comments (3)
- [Physical evaluation] Physical-world evaluation section: high ASR is claimed for the fabricated NORP clothing across viewing angles and fusion architectures, yet no quantitative sim-to-real validation metrics (temperature prediction error, emissivity calibration error, or RGB-T image similarity scores between rendered and real captures) are supplied. Without these, it is impossible to determine whether the reported physical success stems from accurate modeling or from unmodeled factors, directly undermining the central transfer claim.
- [Method] SDCO optimization and 3D model construction: the method optimizes patterns on simulated 3D RGB-T meshes, but the manuscript provides no ablation on the impact of material property assumptions (e.g., thermal emissivity values or non-overlapping layer interactions) or on how sensor response functions are approximated. These modeling choices are load-bearing for the assertion that NORP outperforms ORP and generalizes across architectures.
- [Evaluation] Transferability experiments: the fusion-stage ensemble is presented as improving ASR on unseen detectors, but the evaluation lacks explicit baseline comparisons (e.g., single-model attacks or standard ensemble methods) and reports no statistical significance or variance across the tested fusion architectures, weakening the transferability conclusion.
minor comments (3)
- [Figures] Figure captions for the physical clothing results should explicitly list the detector fusion types, viewing angles, and environmental conditions under which each image was captured.
- [Method] The distinction between NORP and ORP would benefit from a short equation or pseudocode block defining the non-overlap constraint and the resulting radiance model.
- [Related Work] A small number of recent references on thermal adversarial attacks and multimodal fusion defenses are missing from the related-work section.
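As an illustration of the kind of block the [Method] comment asks for — in our own notation and with a deliberately simplified radiance model, not the paper's formulation — the non-overlap constraint and per-modality appearance could be written as:

```latex
% M_v, M_t : \Omega \to \{0,1\}  are binary masks over the clothing
% surface \Omega for the visible-material and thermal-material regions.
%
% Non-overlap constraint:
%   M_v(u)\, M_t(u) = 0 \quad \forall u \in \Omega
%
% Schematic per-modality appearance:
%   I_{\mathrm{rgb}}(u) = M_v(u)\, T_{\mathrm{adv}}(u)
%                       + \bigl(1 - M_v(u)\bigr)\, T_{\mathrm{base}}(u)
%   L_{\mathrm{ir}}(u)  = \bigl[ M_t(u)\, \varepsilon_{\mathrm{adv}}
%                       + \bigl(1 - M_t(u)\bigr)\, \varepsilon_{\mathrm{base}} \bigr]
%                         \, \sigma\, T(u)^4
```

Because the masks are disjoint, neither material attenuates the other's signal, which is the stated advantage of NORP over overlapping (ORP) designs.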
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the empirical grounding of our claims without altering the core contributions.
Point-by-point responses
-
Referee: [Physical evaluation] Physical-world evaluation section: high ASR is claimed for the fabricated NORP clothing across viewing angles and fusion architectures, yet no quantitative sim-to-real validation metrics (temperature prediction error, emissivity calibration error, or RGB-T image similarity scores between rendered and real captures) are supplied. Without these, it is impossible to determine whether the reported physical success stems from accurate modeling or from unmodeled factors, directly undermining the central transfer claim.
Authors: We thank the referee for this important point. Our physical evaluations used real captures of the printed NORP clothing under controlled indoor and outdoor conditions matching the simulation viewpoints, yielding the reported ASRs. However, we did not include explicit quantitative sim-to-real metrics in the original manuscript. In the revised version we will add: (i) temperature prediction errors computed by comparing simulated thermal maps against contactless thermometer measurements on the fabric surface, (ii) details of emissivity calibration using known reference materials, and (iii) RGB-T image similarity scores (SSIM and LPIPS) between rendered and captured pairs. These additions will directly address the concern and support the modeling fidelity. revision: yes
-
Referee: [Method] SDCO optimization and 3D model construction: the method optimizes patterns on simulated 3D RGB-T meshes, but the manuscript provides no ablation on the impact of material property assumptions (e.g., thermal emissivity values or non-overlapping layer interactions) or on how sensor response functions are approximated. These modeling choices are load-bearing for the assertion that NORP outperforms ORP and generalizes across architectures.
Authors: We appreciate the referee drawing attention to the modeling assumptions. The 3D RGB-T meshes use literature-standard emissivity values (0.85 for clothing, 0.95 for skin) and approximate sensor responses via typical RGB and LWIR spectral sensitivity curves; non-overlapping layers are modeled by independent material assignment without cross-layer thermal interaction. While these choices are justified by prior work, we agree an ablation would be valuable. The revised manuscript will include a new ablation subsection (and supplementary figures) varying emissivity by ±0.1, testing alternative sensor response approximations, and measuring resulting ASR changes for both NORP and ORP. This will confirm robustness and strengthen the generalization claims. revision: yes
-
Referee: [Evaluation] Transferability experiments: the fusion-stage ensemble is presented as improving ASR on unseen detectors, but the evaluation lacks explicit baseline comparisons (e.g., single-model attacks or standard ensemble methods) and reports no statistical significance or variance across the tested fusion architectures, weakening the transferability conclusion.
Authors: We agree that the transferability evaluation can be made more rigorous. The current results show the fusion-stage ensemble achieving higher ASR on held-out detectors than the individual models used for optimization. To address the gaps, the revision will add: explicit comparisons against single-model attacks and both input-stage and decision-stage ensemble baselines; mean ASR with standard deviation across five independent optimization runs; and statistical significance testing (paired t-tests with p-values) across the different fusion architectures. These updates will provide clearer evidence for the ensemble's benefit. revision: yes
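The SSIM comparison promised in the first response can be illustrated with a simplified, single-window SSIM (the standard metric averages over local sliding windows; the arrays below are synthetic stand-ins, not paper data):

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed over whole images (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

rng = np.random.default_rng(0)
rendered = rng.random((64, 64))                     # stand-in rendered thermal map
captured = rendered + 0.05 * rng.random((64, 64))   # slightly perturbed "capture"
print(round(global_ssim(rendered, captured), 3))    # close to 1 for similar pairs
```

A large gap between simulated and captured pairs under this kind of metric would indicate exactly the unmodeled factors the referee worries about.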
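On the second response's ±0.1 emissivity ablation: the first-order effect of such a perturbation on emitted LWIR radiance follows from the Stefan-Boltzmann law. A minimal illustration (the 0.85 clothing baseline matches the rebuttal; the surface temperature is our assumption):

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def graybody_exitance(emissivity, temp_k):
    """Total emitted exitance of a graybody surface (Stefan-Boltzmann law)."""
    return emissivity * SIGMA * temp_k ** 4

base = graybody_exitance(0.85, 305.0)  # clothing at ~32 °C, rebuttal's assumed ε
lo = graybody_exitance(0.75, 305.0)    # ablation: ε − 0.1
hi = graybody_exitance(0.95, 305.0)    # ablation: ε + 0.1
print(f"relative swing: -{(base - lo) / base:.1%} / +{(hi - base) / base:.1%}")
# → relative swing: -11.8% / +11.8%
```

A roughly ±12% swing in emitted radiance makes clear why the ablation matters: it is large enough to plausibly shift thermal-channel detector scores.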
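The paired significance test promised in the third response is straightforward to sketch. The per-run ASR values below are hypothetical, and in practice the t statistic would be converted to a p-value against a t-distribution with n − 1 degrees of freedom:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(xs, ys):
    """Paired t statistic for matched ASR samples (e.g., per-run results
    of ensemble vs. single-model attacks on the same held-out detector)."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical ASRs over five independent optimization runs (not paper data).
ensemble_asr = [0.81, 0.78, 0.84, 0.80, 0.79]
single_asr = [0.72, 0.70, 0.75, 0.69, 0.74]
print(round(paired_t(ensemble_asr, single_asr), 2))  # → 8.57
```

Pairing by run controls for shared optimization noise, which is why it is the appropriate test when ensemble and single-model attacks are evaluated on identical held-out detectors.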
Circularity Check
No circularity; derivation uses standard adversarial optimization and empirical evaluation on constructed models
Full rationale
The paper's chain consists of constructing 3D RGB-T models, defining NORP as a non-overlapping pattern design, proposing SDCO for optimization, and reporting attack success rates on various fusion architectures in digital and physical settings. None of these steps reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The central claims rest on explicit simulation choices and experimental measurements rather than any quantity being equivalent to its inputs by construction. This is the expected non-finding for an empirical attack paper.