Visual Adversarial Attack on Vision-Language Models for Autonomous Driving

Aishan Liu, Boyi Jia, Lu Wang, Qiang Fu, Shengshan Hu, Siyuan Liang, Tianyuan Zhang, Xianglong Liu, Xinwei Zhang, Yitong Zhang

Authors on Pith no claims yet

classification 💻 cs.CV

keywords adversarialattacksvlmsattackdrivingmodelsvisualadvlm

0 comments

read the original abstract

Vision-language models (VLMs) have significantly advanced autonomous driving (AD) by enhancing reasoning capabilities. However, these models remain highly vulnerable to adversarial attacks. While existing research has primarily focused on general VLM attacks, the development of attacks tailored to the safety-critical AD context has been largely overlooked. In this paper, we take the first step toward designing adversarial attacks specifically targeting VLMs in AD, exposing the substantial risks these attacks pose within this critical domain. We identify two unique challenges for effective adversarial attacks on AD VLMs: the variability of textual instructions and the time-series nature of visual scenarios. To this end, we propose ADvLM, the first visual adversarial attack framework specifically designed for VLMs in AD. Our framework introduces Semantic-Invariant Induction, which uses a large language model to create a diverse prompt library of textual instructions with consistent semantic content, guided by semantic entropy. Building on this, we introduce Scenario-Associated Enhancement, an approach where attention mechanisms select key frames and perspectives within driving scenarios to optimize adversarial perturbations that generalize across the entire scenario. Extensive experiments on several AD VLMs over multiple benchmarks show that ADvLM achieves state-of-the-art attack effectiveness. Moreover, real-world attack studies further validate its applicability and potential in practice.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Challenging Vision-Language Models with Physically Deployable Multimodal Semantic Lighting Attacks
cs.CV 2026-04 unverdicted novelty 8.0

MSLA is the first physically deployable attack that uses adversarial lighting to break semantic alignment in VLMs such as CLIP, LLaVA, and BLIP, causing classification failures and hallucinations in real scenes.
GuardAD: Safeguarding Autonomous Driving MLLMs via Markovian Safety Logic
cs.AI 2026-05 unverdicted novelty 6.0

GuardAD reduces accident rates by 32% in autonomous driving MLLMs by using n-th order Markovian logic to infer latent hazards and revise actions.
Adversarial Flow Matching for Imperceptible Attacks on End-to-End Autonomous Driving
cs.CV 2026-04 unverdicted novelty 5.0

AFM is a novel gray-box adversarial attack using flow matching to create visually imperceptible perturbations that degrade performance of Vision-Language-Action and modular end-to-end autonomous driving models while s...
A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models
cs.CV 2026-04 unverdicted novelty 5.0

A patch-augmented cross-view regularization method reduces backdoor attack success rates in multimodal LLMs by enforcing output differences between original and perturbed views while using entropy constraints to prese...