arxiv: 2606.30807 · v1 · pith:RC2SFGVUnew · submitted 2026-06-29 · 💻 cs.RO · cs.CR · cs.CV

Off the Rails: Hijacking the Scoring Head in Generative End-to-End Driving Planners with Safety-Violating Adversarial Perturbations

Pith reviewed 2026-07-01 01:29 UTC · model grok-4.3

classification 💻 cs.RO · cs.CR · cs.CV
keywords adversarial attacks · end-to-end autonomous driving · generative planners · scoring head · trajectory selection · safety violations · diffusion models

The pith

Adversarial perturbations can hijack the scoring head in generative autonomous driving planners to select unsafe trajectories instead of safe ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Generative end-to-end driving planners produce a set of candidate trajectories and rely on a learned scoring head, conditioned on BEV features, to pick the one to execute. Derail crafts perturbations that exploit small margins in this scoring step so that unsafe candidates outrank safe ones. The attack produces score drops of 39 to 80 percent and collision rates of up to 50 percent across multiple planners. A reader would care because the scoring head forms the direct link from perception output to vehicle command in current designs.

Core claim

Derail shows that the shared inference pattern of scoring a fixed set of candidate trajectories with one or more learned heads allows safety-violating perturbations to flip selection from safe to unsafe candidates, with measured score reductions of 39-80 percent and collision rates up to 50 percent that exceed those of generic loss-maximization or feature-divergence attacks.

What carries the argument

The scoring head, a learned module conditioned on Bird's-Eye-View features that assigns scores to a fixed set of candidate trajectories and returns the highest-scored candidate as the executed trajectory.
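
The selection step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the linear head, the names `score_candidates` and `select_trajectory`, and the toy feature values are all assumptions made for the example.

```python
from typing import List, Sequence


def score_candidates(bev_features: Sequence[float],
                     head_weights: Sequence[Sequence[float]]) -> List[float]:
    """Score each candidate with a toy linear head: s_i = <w_i, z>.

    Hypothetical stand-in for a learned scoring head; real planners use
    deeper modules conditioned on BEV features.
    """
    return [sum(w * z for w, z in zip(weights, bev_features))
            for weights in head_weights]


def select_trajectory(scores: Sequence[float]) -> int:
    """Return the index of the highest-scored candidate (argmax)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy example: 3 candidate trajectories, 4-dimensional BEV feature vector.
bev = [0.5, -1.0, 2.0, 0.1]
weights = [
    [1.0, 0.0, 0.5, 0.0],   # candidate 0
    [0.2, -0.3, 0.6, 1.0],  # candidate 1
    [0.0, 0.5, 0.1, 0.0],   # candidate 2
]
scores = score_candidates(bev, weights)
chosen = select_trajectory(scores)  # index of the candidate that would be executed
```

Because the output is a hard argmax, only the ordering of the scores matters: any input change that reorders the top two candidates changes the executed trajectory.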

Load-bearing premise

Decision margins between safe and unsafe trajectory candidates inside the scoring head are small enough for targeted input perturbations to reverse their ranking.
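
To make the premise concrete, the sketch below works through a toy case under strong assumptions: a linear scoring head acting directly on a feature vector, and an L-infinity-bounded perturbation of that vector. For such a head, the safe-versus-unsafe margin can shrink by at most eps times the L1 norm of the weight difference, so the ranking flips only when that quantity exceeds the clean margin. The function names and numbers are hypothetical, and the paper's figures show perturbations applied to camera images rather than to features.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def worst_case_perturbation(w_safe: Sequence[float],
                            w_unsafe: Sequence[float],
                            eps: float) -> List[float]:
    """L-infinity perturbation that shrinks the safe-vs-unsafe margin most.

    For a linear head the margin is <w_safe - w_unsafe, z>, so the steepest
    decrease under ||delta||_inf <= eps is -eps * sign(w_safe - w_unsafe).
    """
    return [-eps * sign(ws - wu) for ws, wu in zip(w_safe, w_unsafe)]


# Made-up feature vector and head weights, for illustration only.
z = [1.0, 0.5, -0.2]
w_safe = [0.6, 0.4, 0.1]
w_unsafe = [0.5, 0.3, -0.4]
eps = 0.1

clean_margin = dot(w_safe, z) - dot(w_unsafe, z)

delta = worst_case_perturbation(w_safe, w_unsafe, eps)
z_adv = [zi + di for zi, di in zip(z, delta)]
adv_margin = dot(w_safe, z_adv) - dot(w_unsafe, z_adv)

# Largest possible margin reduction for this eps: eps * ||w_safe - w_unsafe||_1.
max_shrink = eps * sum(abs(ws - wu) for ws, wu in zip(w_safe, w_unsafe))
flipped = adv_margin < 0  # True when the unsafe candidate now outranks the safe one
```

In this toy setting a small clean margin is what makes a small eps sufficient; with a larger clean margin the same perturbation budget would leave the ranking unchanged.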

What would settle it

Apply Derail to an additional generative planner not used in the original evaluation and check whether score drops stay in the 39-80 percent range and collision rates again approach 50 percent.
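
A check of this kind reduces to simple arithmetic once clean and attacked scores are available. The sketch below assumes the reported drop is a relative drop with respect to the clean score; this page does not state which definition the paper uses, and the helper names and example numbers are hypothetical.

```python
def relative_drop_percent(clean_score: float, attacked_score: float) -> float:
    """Relative score drop in percent, measured against the clean score."""
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    return 100.0 * (clean_score - attacked_score) / clean_score


def within_reported_range(drop_percent: float,
                          low: float = 39.0,
                          high: float = 80.0) -> bool:
    """True if the drop falls inside the 39-80 percent range quoted above."""
    return low <= drop_percent <= high


# Made-up scores for a hypothetical new planner.
drop = relative_drop_percent(clean_score=80.0, attacked_score=40.0)
consistent = within_reported_range(drop)
```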

Figures

Figures reproduced from arXiv: 2606.30807 by Halima Bouzidi, Haoyu Liu, Mboutidem Ekemini Mkpong, Mohammad Abdullah Al Faruque.

Figure 1: Overview of the DERAIL framework, targeting generative E2E AD systems.
… GTRS-Aug and GTRS-Dense it is a multi-head aggregate combining imitation with predictions of collision, drivable-area compliance, time-to-collision, ego progress, and lane keeping, each implemented as a head on Z rather than a closed-form geometric check. (4) The scoring head selects the highest-scored candidate v⋆, τ = τ(v⋆). (5) …
Figure 2: Qualitative visualization of DERAIL digital attacks across six NAVSIM scenarios. Each scenario shows the BEV trajectory prediction and front-camera view under clean (left) and adversarial, digitally perturbed (right) conditions. Green trajectories indicate clean predictions; red trajectories indicate collision-inducing predictions under DERAIL.
… columns 1–2), Laggr drives the scoring head toward high-speed f…
Figure 3: Qualitative visualization of DERAIL physical patch-based attacks across three NAVSIM scenarios. Each scenario shows the BEV trajectory prediction and front-camera view under clean (left) and adversarially perturbed via physical patch (right) conditions.
Figure 4: Gradient alignment with the collision objective across scene tokens (02 logs).
Original abstract

Generative models have recently seen rapid adoption in End-to-End (E2E) autonomous driving (AD), with diffusion-based denoising and vocabulary-based retrieval becoming the dominant trajectory-decoding paradigms. Despite their architectural diversity, current generative AD planners share a common inference pattern: a fixed set of candidate trajectories (anchors, vocabulary entries, or proposal queries) is scored by one or more learned heads conditioned on the Bird's-Eye-View (BEV) features, and the highest-scored candidate is returned as the final trajectory. Under this design, the scoring head is the only barrier between perception and the motion command, and its decision margins between competing candidates are often small. We introduce Derail, an adversarial framework that exploits this scoring-head attack surface. Evaluated on various generative planners, Derail flips the trajectory selection from a safe to an unsafe candidate, with score drops of 39-80% and collision rates of up to 50%, consistently outperforming generic loss-maximization and feature-divergence attacks. Our analysis suggests that safety-violating objectives govern attack effectiveness against generative AD planners, and that the scoring-head inference pattern itself is a recurring attack surface worth explicit defensive consideration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces Derail, an adversarial attack framework targeting the scoring head in generative end-to-end autonomous driving planners. It exploits the common inference pattern where candidate trajectories are scored and the highest is selected, claiming small decision margins allow flipping from safe to unsafe trajectories. The paper reports that Derail achieves score drops of 39-80% and collision rates up to 50% on various planners, outperforming generic loss-maximization and feature-divergence attacks, and suggests that safety-violating objectives are key to effectiveness.

Significance. If the empirical results hold, this work highlights a structural vulnerability in current generative AD planners and provides evidence that the scoring-head pattern is an attack surface. The cross-planner evaluation and comparison to baselines are positive aspects. It contributes to understanding attack effectiveness in this domain.

major comments (1)
  1. [Abstract] The central claim that the scoring head's 'decision margins between competing candidates are often small' is the stated attack surface, yet the manuscript provides no quantification of pre-attack score gaps (or their distribution) between the highest-scoring safe candidate and the highest unsafe candidate on clean inputs for the evaluated planners. This is load-bearing, as success could be explained by the safety-violating objective alone rather than the claimed architectural pattern.
minor comments (1)
  1. The abstract reports quantitative results on score drops and collision rates but provides no details on experimental setup, number of trials, statistical tests, or exact baselines; these must be supplied in the main text (e.g., §4 or §5) for verification.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting this important point regarding the central claim in the abstract. We address the comment below and commit to strengthening the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the scoring head's 'decision margins between competing candidates are often small' is the stated attack surface, yet the manuscript provides no quantification of pre-attack score gaps (or their distribution) between the highest-scoring safe candidate and the highest unsafe candidate on clean inputs for the evaluated planners. This is load-bearing, as success could be explained by the safety-violating objective alone rather than the claimed architectural pattern.

    Authors: We agree that explicit quantification of the pre-attack score gaps is necessary to substantiate the claim that small decision margins constitute the primary attack surface. The current manuscript does not provide this analysis or any distribution statistics on clean inputs. In the revised manuscript we will add a dedicated analysis (new figure and table) reporting the mean, median, and distribution of score differences between the top safe and top unsafe candidates across all evaluated planners and datasets. This will allow direct assessment of whether the observed attack success is attributable to the architectural pattern rather than the objective alone. revision: yes
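
The statistics promised in this response could be computed along the following lines. This is a rough sketch under stated assumptions, not the authors' analysis: it assumes each clean scene yields a list of (score, is_safe) pairs with the safety label supplied externally, and the helper names, the `small` threshold, and the example scores are all made up.

```python
import statistics
from typing import Dict, List, Optional, Tuple

Candidate = Tuple[float, bool]  # (score, is_safe); labels assumed given


def safe_unsafe_margin(candidates: List[Candidate]) -> Optional[float]:
    """Top safe score minus top unsafe score for one clean scene.

    Returns None when a scene has no safe or no unsafe candidate,
    since the gap is undefined there.
    """
    safe = [s for s, ok in candidates if ok]
    unsafe = [s for s, ok in candidates if not ok]
    if not safe or not unsafe:
        return None
    return max(safe) - max(unsafe)


def summarize_margins(scenes: List[List[Candidate]],
                      small: float = 0.05) -> Dict[str, float]:
    """Mean, median, minimum, and share of margins below `small`."""
    margins = [m for m in map(safe_unsafe_margin, scenes) if m is not None]
    return {
        "n": len(margins),
        "mean": statistics.mean(margins),
        "median": statistics.median(margins),
        "min": min(margins),
        "frac_below_small": sum(m < small for m in margins) / len(margins),
    }


# Made-up scores for three scenes, purely for illustration.
scenes = [
    [(0.90, True), (0.88, False), (0.40, False)],
    [(0.75, True), (0.60, True), (0.30, False)],
    [(0.55, False), (0.52, True)],
]
summary = summarize_margins(scenes)
```

A negative margin, as in the third toy scene, means the planner already selects an unsafe candidate on clean input; reporting those cases separately would keep them from being counted as attack successes.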

Circularity Check

0 steps flagged

Empirical attack evaluation contains no derivation chain or self-referential steps

The paper presents an adversarial attack method (Derail) and reports its empirical performance on existing generative planners via direct measurements of score drops, collision rates, and comparisons to baselines. No equations, fitted parameters renamed as predictions, uniqueness theorems, or ansatzes appear in the provided text. The claim that scoring-head margins are often small is stated as an observed architectural pattern rather than derived from prior results within the paper. All reported outcomes are external measurements on fixed models and therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that generative planners follow a fixed-candidate scoring pattern with small decision margins; no free parameters or new physical entities are introduced.

axioms (1)
  • domain assumption: Generative AD planners share a common inference pattern: a fixed set of candidate trajectories is scored by one or more learned heads conditioned on BEV features.
    Explicitly stated in the abstract as the dominant trajectory-decoding paradigm across diffusion-based and vocabulary-based methods.

pith-pipeline@v0.9.1-grok · 5787 in / 1188 out tokens · 48463 ms · 2026-07-01T01:29:46.336646+00:00 · methodology
