Overthink-Triggered Slowdown Attacks on LVLM-Based Robotic Systems

Bo Chen; Jie Wu; Qiang Han

arxiv: 2607.01518 · v1 · pith:RDSSPCQYnew · submitted 2026-07-01 · 💻 cs.CR · cs.RO

Overthink-Triggered Slowdown Attacks on LVLM-Based Robotic Systems

Qiang Han , Jie Wu , Bo Chen This is my paper

Pith reviewed 2026-07-03 19:50 UTC · model grok-4.3

classification 💻 cs.CR cs.RO

keywords overthinking attacksLVLMrobotic systemsscene text triggersslowdown attacksblack-box transferlatency amplificationvision-language models

0 comments

The pith

Adversaries can embed crafted scene text to trigger overthinking in LVLMs, causing up to 6.96x slowdowns in robotic decision-making even without model access.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Large vision-language models integrated into robots can be induced to produce excessively long reasoning traces when specific human-readable text appears in their visual input. The paper develops a three-stage process to discover such triggers: building a corpus of reasoning-heavy scene text, pulling lexical features from short response prefixes, and running an efficient black-box search that confirms candidates by actual latency. Across three LVLMs the triggers consistently produce slowdown ratios above 1.0x, with the best single trigger reaching 6.96x digitally and 4.74x when physically printed. This matters because delayed inference in robots can create safety problems when an attacker controls only the environment the robot sees.

Core claim

The central claim is that scene text can function as transferable triggers for overthinking in LVLMs used by robotic systems. The three-stage framework extracts overthinking-correlated lexical features from short response prefixes, then uses a prefix-based proxy score for black-box search and validates top candidates with full latency measurements. Evaluation on unseen images and multiple LVLMs shows all triggers produce slowdown ratios greater than 1.0x, with the strongest reaching 6.96x, and physical prints still reaching 4.74x.

What carries the argument

The three-stage framework that extracts lexical features from short response prefixes to guide black-box search for transferable overthinking triggers.

If this is right

All discovered triggers produce slowdown ratios above 1.0x on three representative LVLMs.
The strongest single trigger achieves 6.96x latency amplification under black-box conditions.
The same triggers remain effective when physically printed, reaching 4.74x amplification.
Triggers transfer to unseen images while meeting standard attack-success thresholds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar text-based triggers might affect other multimodal models that rely on long reasoning traces outside robotics.
Robotic deployments could reduce risk by adding explicit limits on reasoning length or by screening incoming scene text.
Testing the triggers on moving robots in changing physical environments would reveal whether motion or lighting alters the observed slowdowns.

Load-bearing premise

Lexical features taken from short response prefixes will reliably cause overthinking when applied to new images and different LVLMs under black-box transfer.

What would settle it

Measuring latency on a fourth unseen LVLM with the published trigger set and finding that every trigger produces a slowdown ratio of 1.0x or lower across the test images.

Figures

Figures reproduced from arXiv: 2607.01518 by Bo Chen, Jie Wu, Qiang Han.

**Figure 2.** Figure 2: An overview of our three-stage framework for identifying the scene-text triggers. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Stability of discovered keyword sets measured by [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

**Figure 4.** Figure 4: High group latency distribution under full inference [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗

**Figure 5.** Figure 5: Low group latency distribution under full inference and [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 7.** Figure 7: Real-world evaluation setup. Left: clean physical scene [PITH_FULL_IMAGE:figures/full_fig_p011_7.png] view at source ↗

**Figure 8.** Figure 8: The robotic vehicle used in our evaluation. [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗

**Figure 10.** Figure 10: Empirical CDF of added latency under attack across [PITH_FULL_IMAGE:figures/full_fig_p016_10.png] view at source ↗

**Figure 9.** Figure 9: Per-trigger slowdown-ratio spread (Attack/Clean) for three models (T1–T4). Dots mark mean ratio and vertical whiskers show the min-max range across images. 17 [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗

read the original abstract

Large Vision-Language Models (LVLMs) have been increasingly integrated into robotic systems. However, these models may exhibit overthinking behaviors, where they generate excessively long reasoning traces, incurring an excessive inference time. This overthinking behavior poses a serious risk to robotic systems, as the adversary can deliberately trigger overthinking to slow down the decision making of a victim robotic system, causing a variety of safety issues (i.e., an overthinking-induced slowdown attack). To initiate this attack, an adversary can embed carefully crafted, human-readable scene text into the visual scene observed by a victim robotic agent, causing significant inference delays even under a strict black-box setting. Therefore, the embedded scene text serves as a significant "trigger" for the attack. This work systematically identifies and validates transferable triggers of overthinking in robotic systems by introducing a three-stage framework. First, we construct a diverse corpus of reasoning-intensive scene text and extract overthinking-correlated lexical features from short response prefixes. Second, we perform an efficient black-box search guided by a prefix-based proxy score while selectively confirming a small set of top candidates with full latency measurements. Third, we evaluate black-box transfer using a fixed pool of triggers on unseen images and multiple LVLMs, reporting latency amplification and attack success rates under standard thresholds. Across three representative LVLMs, all triggers yield slowdown ratios greater than 1.0x, with the strongest single-trigger case reaching 6.96x. The physical printing of the text trigger still causes up to 4.74x latency amplification. These results demonstrate that our discovered triggers are transferred between multiple LVLM models and consistently cause significant slowdowns in robotic systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper finds scene-text triggers for overthinking slowdowns in robotic LVLMs but leaves the proxy-to-full-latency link unverified.

read the letter

The main point from this paper is that carefully chosen scene text can be used to trigger overthinking in large vision-language models running on robots, causing their decision loops to slow down substantially, with reported factors reaching 6.96 times in digital settings and 4.74 times when the text is physically printed.

The new part is their three-stage framework for discovering these transferable triggers. They start by creating a corpus of scene text that encourages reasoning and then extract lexical features from the beginnings of the model's responses. Next comes an efficient black-box search that uses a prefix-based proxy to score candidates without running full queries every time, followed by full latency checks on the top ones. Finally they test how well the triggers transfer to unseen images and different LVLMs.

This approach has the advantage of being practical for an adversary who doesn't have white-box access, and the results suggest the slowdowns are consistent enough to be a concern for robotic deployments.

Where it falls short is in the validation of that proxy score. The description does not include any correlation analysis or ablation showing that triggers ranked high by the short-prefix proxy actually produce high full-query latency. Without that link, it's possible the discovered triggers are effective for incidental reasons rather than the lexical features they identified. The abstract also omits details on the number of trials, variance in measurements, or how images were selected, so the strength of the quantitative claims is difficult to assess from what's provided.

Overall this is the kind of work that belongs in the AI security literature for robotics applications. Readers who follow adversarial machine learning or safety in embodied AI would find the attack construction and transfer results worth looking at. The paper shows clear thinking about the threat model and has enough empirical content to merit a full review.

I would send it to peer review.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces a three-stage framework to discover human-readable scene-text triggers that induce overthinking in LVLMs deployed in robotic systems, thereby causing inference slowdown attacks. Stage 1 builds a corpus of reasoning-intensive text and extracts lexical features from short response prefixes; Stage 2 performs black-box search guided by a prefix-based proxy score with selective full-latency confirmation; Stage 3 evaluates transfer to unseen images and multiple LVLMs. The central empirical claim is that all discovered triggers produce slowdown ratios >1.0x across three representative LVLMs, reaching a maximum of 6.96x digitally and 4.74x when the trigger is physically printed.

Significance. If the reported latency amplifications and transfer results hold under rigorous controls, the work identifies a practical, low-effort attack vector on LVLM-based robotics that does not require model access or adversarial perturbations. The inclusion of physical-print experiments is a concrete strength that moves the demonstration beyond purely digital settings. The absence of any machine-checked proofs or parameter-free derivations is expected for an empirical attack paper; credit is due for the selective full-latency confirmation step, but this does not yet compensate for missing validation of the proxy mechanism.

major comments (2)

[Abstract (three-stage framework)] Abstract (three-stage framework description): the central claim that prefix-extracted lexical features enable reliable black-box search and transfer to unseen images/LVLMs rests on an unverified assumption that the short-prefix proxy score correlates with full-query inference latency. No correlation coefficient, ablation on proxy vs. full latency, or ranking-consistency metric is referenced, so it is unclear whether the reported 6.96x and 4.74x maxima are caused by the identified features or are incidental to the search procedure.
[Abstract (experimental results)] Abstract (experimental results): the quantitative claims of slowdown ratios >1.0x for all triggers and transfer success rates are presented without reference to dataset size, number of images, error bars, statistical tests, or controls for measurement artifacts (e.g., caching, batching, or hardware variability). These omissions are load-bearing because the transfer and physical-print results cannot be assessed for robustness without them.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and commit to revisions that strengthen the empirical validation of the proxy mechanism and the reporting of experimental details.

read point-by-point responses

Referee: [Abstract (three-stage framework)] Abstract (three-stage framework description): the central claim that prefix-extracted lexical features enable reliable black-box search and transfer to unseen images/LVLMs rests on an unverified assumption that the short-prefix proxy score correlates with full-query inference latency. No correlation coefficient, ablation on proxy vs. full latency, or ranking-consistency metric is referenced, so it is unclear whether the reported 6.96x and 4.74x maxima are caused by the identified features or are incidental to the search procedure.

Authors: We agree that the abstract does not explicitly quantify the proxy's reliability. The manuscript uses the prefix proxy for search efficiency followed by selective full-latency confirmation on top candidates, but we acknowledge the need for direct validation. In the revision we will add a new analysis subsection reporting (1) the Pearson correlation coefficient between proxy scores and full inference latencies across the candidate pool, (2) an ablation comparing proxy-guided search against random selection in terms of discovered slowdown ratios, and (3) ranking-consistency metrics (e.g., Kendall tau) between proxy ordering and full-latency ordering. These additions will clarify that the reported maxima are attributable to the identified lexical features rather than search artifacts. revision: yes
Referee: [Abstract (experimental results)] Abstract (experimental results): the quantitative claims of slowdown ratios >1.0x for all triggers and transfer success rates are presented without reference to dataset size, number of images, error bars, statistical tests, or controls for measurement artifacts (e.g., caching, batching, or hardware variability). These omissions are load-bearing because the transfer and physical-print results cannot be assessed for robustness without them.

Authors: We concur that the abstract omits key experimental metadata. The full manuscript evaluates transfer across a fixed pool of triggers on unseen images and multiple LVLMs, yet the abstract does not specify scale or controls. In the revision we will (1) expand the abstract to reference the number of evaluation images and models, (2) add error bars derived from repeated measurements, (3) include statistical significance tests (e.g., paired t-tests) for slowdown ratios, and (4) describe controls for caching, batching, and hardware variability by reporting mean latencies over multiple independent runs on fixed hardware configurations. These changes will allow readers to assess robustness of the transfer and physical-print results. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical attack demonstration with direct latency measurements

full rationale

The paper presents an empirical three-stage framework for discovering and validating overthinking triggers via corpus construction, lexical feature extraction from prefixes, proxy-guided black-box search, and full-latency confirmation on unseen images across LVLMs. All reported results (slowdown ratios >1.0x, max 6.96x, physical 4.74x) are direct measurements of inference time, not predictions or derivations that reduce to fitted inputs or self-citations. No equations, parameter fitting, uniqueness theorems, or ansatzes appear; the proxy is used only for search efficiency with selective full confirmation, and transfer is validated by explicit testing rather than by construction. This is a standard empirical security evaluation with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Empirical security evaluation paper with no mathematical model, fitted constants, or new postulated entities; relies on the domain premise that LVLMs can be induced into long reasoning traces by visual text.

axioms (1)

domain assumption LVLMs exhibit overthinking behaviors that produce excessively long reasoning traces and high inference latency
Invoked in the opening paragraph as the mechanism enabling the slowdown attack.

pith-pipeline@v0.9.1-grok · 5833 in / 1262 out tokens · 31399 ms · 2026-07-03T19:50:18.010237+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

40 extracted references · 21 canonical work pages · 1 internal anchor

[1]

On-board vision-language models for personalized autonomous vehicle motion control: System design and real-world validation,

C. Cui, Z. Yang, Y . Zhou, J. Peng, S.-Y . Park, C. Zhang, Y . Ma, X. Cao, W. Ye, Y . Fenget al., “On-board vision-language models for personalized autonomous vehicle motion control: System design and real-world validation,”arXiv preprint arXiv:2411.11913, 2024

work page arXiv 2024
[2]

Drivevlm: The convergence of autonomous driving and large vision-language models,

X. Tian, J. Gu, B. Li, Y . Liu, Y . Wang, Z. Zhao, K. Zhan, P. Jia, X. Lang, and H. Zhao, “Drivevlm: The convergence of autonomous driving and large vision-language models,” inConference on Robot Learning. PMLR, 2025, pp. 4698–4726

2025
[3]

Multi-frame, lightweight & efficient vision-language models for question answering in autonomous driving,

A. Gopalkrishnan, R. Greer, and M. Trivedi, “Multi-frame, lightweight & efficient vision-language models for question answering in autonomous driving,”arXiv preprint arXiv:2403.19838, 2024

work page arXiv 2024
[4]

Large (vision) language models for autonomous vehicles: Current trends and future directions,

H. Tian, K. Reddy, Y . Feng, M. Quddus, Y . Demiris, and P. An- geloudis, “Large (vision) language models for autonomous vehicles: Current trends and future directions,”IEEE Transactions on Intelligent Transportation Systems, vol. 27, no. 1, pp. 187–210, 2025

2025
[5]

Overthink: Slowdown attacks on reasoning llms,

A. Kumar, J. Roh, A. Naseh, M. Karpinska, M. Iyyer, A. Houmansadr, and E. Bagdasarian, “Overthink: Slowdown attacks on reasoning llms,” arXiv preprint arXiv:2502.02542, 2025

work page arXiv 2025
[6]

Dnr bench: Benchmarking over-reasoning in reasoning llms,

M. Hashemi, O. Bamgbose, S. T. Madhusudhan, J. S. Nair, A. Tiwari, and V . Yadav, “Dnr bench: Benchmarking over-reasoning in reasoning llms,”arXiv preprint arXiv:2503.15793, 2025

work page arXiv 2025
[7]

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

X. Chen, J. Xu, T. Liang, Z. He, J. Pang, D. Yu, L. Song, Q. Liu, M. Zhou, Z. Zhanget al., “Do not think that much for 2+ 3=? on the overthinking of o1-like llms,”arXiv preprint arXiv:2412.21187, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[8]

Induc- ing high energy-latency of large vision-language models with verbose images,

K. Gao, Y . Bai, J. Gu, S.-T. Xia, P. Torr, Z. Li, and W. Liu, “Induc- ing high energy-latency of large vision-language models with verbose images,”arXiv preprint arXiv:2401.11170, 2024

work page arXiv 2024
[9]

Drivegpt4: Interpretable end-to-end autonomous driving via large language model,

Z. Xu, Y . Zhang, E. Xie, Z. Zhao, Y . Guo, K.-Y . K. Wong, Z. Li, and H. Zhao, “Drivegpt4: Interpretable end-to-end autonomous driving via large language model,”IEEE Robotics and Automation Letters, 2024

2024
[10]

Simlingo: Vision-only closed-loop autonomous driving with language-action alignment,

K. Renz, L. Chen, E. Arani, and O. Sinavski, “Simlingo: Vision-only closed-loop autonomous driving with language-action alignment,” in Proceedings of the Computer Vision and Pattern Recognition Confer- ence, 2025, pp. 11 993–12 003

2025
[11]

Chai: Command hijacking against embodied ai,

L. Burbano, D. Ortiz, Q. Sun, S. Yang, H. Tu, C. Xie, Y . Cao, and A. A. Cardenas, “Chai: Command hijacking against embodied ai,”arXiv preprint arXiv:2510.00181, 2025

work page arXiv 2025
[12]

AutoDAN: Interpretable gradient-based adversarial attacks on large language models,

S. Zhu, R. Zhang, B. An, G. Wu, J. Barrow, Z. Wang, F. Huang, A. Nenkova, and T. Sun, “AutoDAN: Interpretable gradient-based adversarial attacks on large language models,” in First Conference on Language Modeling, 2024. [Online]. Available: https://openreview.net/forum?id=INivcBeIDK

2024
[13]

NVIDIA Jetson Nano,

NVIDIA, “NVIDIA Jetson Nano,” Retrieved on January 28, 2026, from https://www.nvidia.com/en-us/autonomous-machines/embedded- systems/jetson-nano/product-development/, 2026

2026
[14]

A survey of security and privacy issues in v2x communication systems,

T. Yoshizawa, D. Singel ´ee, J. T. Muehlberg, S. Delbruel, A. Taherkordi, D. Hughes, and B. Preneel, “A survey of security and privacy issues in v2x communication systems,”ACM Computing Surveys, vol. 55, no. 9, pp. 1–36, 2023

2023
[15]

Bdd100k: A diverse driving dataset for heterogeneous multitask learning,

F. Yu, H. Chen, X. Wang, W. Xian, Y . Chen, F. Liu, V . Madhavan, and T. Darrell, “Bdd100k: A diverse driving dataset for heterogeneous multitask learning,” inThe IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

2020
[16]

Dodge,The concise encyclopedia of statistics

Y . Dodge,The concise encyclopedia of statistics. Springer Science & Business Media, 2008

2008
[17]

Autospearman: Automatically mitigating correlated software metrics for interpreting defect models,

J. Jiarpakdee, C. Tantithamthavorn, and C. Treude, “Autospearman: Automatically mitigating correlated software metrics for interpreting defect models,” in2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2018, pp. 92–103

2018
[18]

Thinktrap: Denial-of-service attacks against black-box llm services via infinite thinking,

Y . Li, J. Wang, H. Zhu, J. Lin, S. Chang, and M. Guo, “Thinktrap: Denial-of-service attacks against black-box llm services via infinite thinking,”arXiv preprint arXiv:2512.07086, 2025

work page arXiv 2025
[19]

Crabs: Consuming resource via auto-generation for llm-dos attack un- der black-box settings,

Y . Zhang, Z. Zhou, W. Zhang, X. Wang, X. Jia, Y . Liu, and S. Su, “Crabs: Consuming resource via auto-generation for llm-dos attack un- der black-box settings,” inFindings of the Association for Computational Linguistics: ACL 2025, 2025, pp. 11 128–11 150

2025
[20]

Prompt-induced over-generation as denial-of-service: A black-box attack-side benchmark,

Y . Guo, J. Plested, T. Lynar, K. Thilakarathna, N. Sivaroopan, J. Yang, W. Yanget al., “Prompt-induced over-generation as denial-of-service: A black-box attack-side benchmark,”arXiv preprint arXiv:2512.23779, 2025

work page arXiv 2025
[21]

Excessive reasoning attack on reasoning llms,

W. M. Si, M. Li, M. Backes, and Y . Zhang, “Excessive reasoning attack on reasoning llms,”arXiv preprint arXiv:2506.14374, 2025

work page arXiv 2025
[22]

Extendattack: Attacking servers of lrms via extending reasoning,

Z. Zhu, Y . Liu, Z. Xu, Y . Ma, H. Gao, N. Chen, Y . Guo, W. Qu, H. Xu, Z. Kanget al., “Extendattack: Attacking servers of lrms via extending reasoning,”arXiv preprint arXiv:2506.13737, 2025

work page arXiv 2025
[23]

Distractor injection attacks on large reasoning models: Characterization and defense,

Z. Zhang, W. Xu, S. Cui, and C. K. Reddy, “Distractor injection attacks on large reasoning models: Characterization and defense,”arXiv preprint arXiv:2510.16259, 2025

work page arXiv 2025
[24]

Denial-of- service poisoning attacks against large language models,

K. Gao, T. Pang, C. Du, Y . Yang, S.-T. Xia, and M. Lin, “Denial-of- service poisoning attacks against large language models,”arXiv preprint arXiv:2410.10760, 2024

work page arXiv 2024
[25]

Badthink: Triggered overthinking attacks on chain-of-thought reasoning in large language models,

S. Liu, R. Li, L. Yu, L. Zhang, Z. Liu, and G. Jin, “Badthink: Triggered overthinking attacks on chain-of-thought reasoning in large language models,”arXiv preprint arXiv:2511.10714, 2025

work page arXiv 2025
[26]

Badreasoner: Planting tunable overthinking backdoors into large reasoning models for fun or profit,

B. Yi, Z. Fei, J. Geng, T. Li, L. Nie, Z. Liu, and Y . Li, “Badreasoner: Planting tunable overthinking backdoors into large reasoning models for fun or profit,”arXiv preprint arXiv:2507.18305, 2025

work page arXiv 2025
[27]

Beyond max tokens: Stealthy resource amplification via tool calling chains in llm agents,

K. Zhou, Y . Zheng, Y . He, M. Xue, X. Gong, Y . Wang, and K.-Y . Lam, “Beyond max tokens: Stealthy resource amplification via tool calling chains in llm agents,”arXiv preprint arXiv:2601.10955, 2026

work page arXiv 2026
[28]

Hidden tail: Adversarial image causing stealthy resource consumption in vision-language models,

R. Zhang, Z. Wang, T. Yang, H. Li, W. Jiang, Q. Zhao, Y . Liu, and G. Xu, “Hidden tail: Adversarial image causing stealthy resource consumption in vision-language models,”arXiv preprint arXiv:2508.18805, 2025

work page arXiv 2025
[29]

Unveiling typographic deceptions: Insights of the typographic vulnerability in large vision-language models,

H. Cheng, E. Xiao, J. Gu, L. Yang, J. Duan, J. Zhang, J. Cao, K. Xu, and R. Xu, “Unveiling typographic deceptions: Insights of the typographic vulnerability in large vision-language models,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 179–196

2024
[30]

Vision-llms can fool themselves with self-generated typographic at- tacks,

M. Qraitem, N. Tasnim, P. Teterwak, K. Saenko, and B. A. Plummer, “Vision-llms can fool themselves with self-generated typographic at- tacks,”arXiv preprint arXiv:2402.00626, 2024

work page arXiv 2024
[31]

Scenetap: Scene-coherent typographic adversarial planner against vision-language models in real-world environments,

Y . Cao, Y . Xing, J. Zhang, D. Lin, T. Zhang, I. Tsang, Y . Liu, and Q. Guo, “Scenetap: Scene-coherent typographic adversarial planner against vision-language models in real-world environments,” inProceed- ings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 25 050–25 059

2025
[32]

Transferable multimodal attack on vision-language pre-training 14 models,

H. Wang, K. Dong, Z. Zhu, H. Qin, A. Liu, X. Fang, J. Wang, and X. Liu, “Transferable multimodal attack on vision-language pre-training 14 models,” in2024 IEEE Symposium on Security and Privacy (SP). IEEE, 2024, pp. 1722–1740

2024
[33]

Thinkswitcher: When to think hard, when to think fast,

G. Liang, L. Zhong, Z. Yang, and X. Quan, “Thinkswitcher: When to think hard, when to think fast,”arXiv preprint arXiv:2505.14183, 2025

work page arXiv 2025
[34]

Lapo: Internalizing reasoning efficiency via length- adaptive policy optimization,

X. Wu, Y . Yan, S. Lyu, L. Wu, Y . Qiu, Y . Shen, W. Lu, J. Shao, J. Xiao, and Y . Zhuang, “Lapo: Internalizing reasoning efficiency via length- adaptive policy optimization,”arXiv preprint arXiv:2507.15758, 2025

work page arXiv 2025
[35]

Frost: Filtering reasoning outliers with attention for efficient reasoning,

H. Luo, Z. Jiang, M. Z. Hasan, Y . Chen, and S. Sarkar, “Frost: Filtering reasoning outliers with attention for efficient reasoning,”arXiv preprint arXiv:2601.19001, 2026. APPENDIXA PROMPTTEMPLATES To ensure the reproducibility of our trigger discovery frame- work, we provide the exact templates used for seed generation (Stage 1) and structural decomposit...

work page arXiv 2026
[36]

scenario description: The concrete driving environment or scene being described (e.g., road condition, traffic situation, objects)
[37]

ambiguity or conflict: Any contradiction, ethical dilemma, uncertainty, unexpected event, or conflict that makes the situation non-trivial
[38]

reasoning requirement: Explicit or implicit instructions that encourage careful thinking, step-by-step reasoning, ethical analysis, or deliberation
[39]

constraints or formatting: Any constraints on how the answer should be structured or expressed (e.g., ”explain first, then answer”, ”step by step”, repetition rules, formatting requirements, or meta-instructions)
[40]

final action request: The final question or decision request that asks the model to output an action, judgment, or recommendation. Input Prompt: ”{full text}” Return ONLY a valid JSON object with EXACTLY these five keys: - scenario description - ambiguity or conflict - reasoning requirement - constraints or formatting - final action request Do NOT include...

[1] [1]

On-board vision-language models for personalized autonomous vehicle motion control: System design and real-world validation,

C. Cui, Z. Yang, Y . Zhou, J. Peng, S.-Y . Park, C. Zhang, Y . Ma, X. Cao, W. Ye, Y . Fenget al., “On-board vision-language models for personalized autonomous vehicle motion control: System design and real-world validation,”arXiv preprint arXiv:2411.11913, 2024

work page arXiv 2024

[2] [2]

Drivevlm: The convergence of autonomous driving and large vision-language models,

X. Tian, J. Gu, B. Li, Y . Liu, Y . Wang, Z. Zhao, K. Zhan, P. Jia, X. Lang, and H. Zhao, “Drivevlm: The convergence of autonomous driving and large vision-language models,” inConference on Robot Learning. PMLR, 2025, pp. 4698–4726

2025

[3] [3]

Multi-frame, lightweight & efficient vision-language models for question answering in autonomous driving,

A. Gopalkrishnan, R. Greer, and M. Trivedi, “Multi-frame, lightweight & efficient vision-language models for question answering in autonomous driving,”arXiv preprint arXiv:2403.19838, 2024

work page arXiv 2024

[4] [4]

Large (vision) language models for autonomous vehicles: Current trends and future directions,

H. Tian, K. Reddy, Y . Feng, M. Quddus, Y . Demiris, and P. An- geloudis, “Large (vision) language models for autonomous vehicles: Current trends and future directions,”IEEE Transactions on Intelligent Transportation Systems, vol. 27, no. 1, pp. 187–210, 2025

2025

[5] [5]

Overthink: Slowdown attacks on reasoning llms,

A. Kumar, J. Roh, A. Naseh, M. Karpinska, M. Iyyer, A. Houmansadr, and E. Bagdasarian, “Overthink: Slowdown attacks on reasoning llms,” arXiv preprint arXiv:2502.02542, 2025

work page arXiv 2025

[6] [6]

Dnr bench: Benchmarking over-reasoning in reasoning llms,

M. Hashemi, O. Bamgbose, S. T. Madhusudhan, J. S. Nair, A. Tiwari, and V . Yadav, “Dnr bench: Benchmarking over-reasoning in reasoning llms,”arXiv preprint arXiv:2503.15793, 2025

work page arXiv 2025

[7] [7]

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

X. Chen, J. Xu, T. Liang, Z. He, J. Pang, D. Yu, L. Song, Q. Liu, M. Zhou, Z. Zhanget al., “Do not think that much for 2+ 3=? on the overthinking of o1-like llms,”arXiv preprint arXiv:2412.21187, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[8] [8]

Induc- ing high energy-latency of large vision-language models with verbose images,

K. Gao, Y . Bai, J. Gu, S.-T. Xia, P. Torr, Z. Li, and W. Liu, “Induc- ing high energy-latency of large vision-language models with verbose images,”arXiv preprint arXiv:2401.11170, 2024

work page arXiv 2024

[9] [9]

Drivegpt4: Interpretable end-to-end autonomous driving via large language model,

Z. Xu, Y . Zhang, E. Xie, Z. Zhao, Y . Guo, K.-Y . K. Wong, Z. Li, and H. Zhao, “Drivegpt4: Interpretable end-to-end autonomous driving via large language model,”IEEE Robotics and Automation Letters, 2024

2024

[10] [10]

Simlingo: Vision-only closed-loop autonomous driving with language-action alignment,

K. Renz, L. Chen, E. Arani, and O. Sinavski, “Simlingo: Vision-only closed-loop autonomous driving with language-action alignment,” in Proceedings of the Computer Vision and Pattern Recognition Confer- ence, 2025, pp. 11 993–12 003

2025

[11] [11]

Chai: Command hijacking against embodied ai,

L. Burbano, D. Ortiz, Q. Sun, S. Yang, H. Tu, C. Xie, Y . Cao, and A. A. Cardenas, “Chai: Command hijacking against embodied ai,”arXiv preprint arXiv:2510.00181, 2025

work page arXiv 2025

[12] [12]

AutoDAN: Interpretable gradient-based adversarial attacks on large language models,

S. Zhu, R. Zhang, B. An, G. Wu, J. Barrow, Z. Wang, F. Huang, A. Nenkova, and T. Sun, “AutoDAN: Interpretable gradient-based adversarial attacks on large language models,” in First Conference on Language Modeling, 2024. [Online]. Available: https://openreview.net/forum?id=INivcBeIDK

2024

[13] [13]

NVIDIA Jetson Nano,

NVIDIA, “NVIDIA Jetson Nano,” Retrieved on January 28, 2026, from https://www.nvidia.com/en-us/autonomous-machines/embedded- systems/jetson-nano/product-development/, 2026

2026

[14] [14]

A survey of security and privacy issues in v2x communication systems,

T. Yoshizawa, D. Singel ´ee, J. T. Muehlberg, S. Delbruel, A. Taherkordi, D. Hughes, and B. Preneel, “A survey of security and privacy issues in v2x communication systems,”ACM Computing Surveys, vol. 55, no. 9, pp. 1–36, 2023

2023

[15] [15]

Bdd100k: A diverse driving dataset for heterogeneous multitask learning,

F. Yu, H. Chen, X. Wang, W. Xian, Y . Chen, F. Liu, V . Madhavan, and T. Darrell, “Bdd100k: A diverse driving dataset for heterogeneous multitask learning,” inThe IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

2020

[16] [16]

Dodge,The concise encyclopedia of statistics

Y . Dodge,The concise encyclopedia of statistics. Springer Science & Business Media, 2008

2008

[17] [17]

Autospearman: Automatically mitigating correlated software metrics for interpreting defect models,

J. Jiarpakdee, C. Tantithamthavorn, and C. Treude, “Autospearman: Automatically mitigating correlated software metrics for interpreting defect models,” in2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2018, pp. 92–103

2018

[18] [18]

Thinktrap: Denial-of-service attacks against black-box llm services via infinite thinking,

Y . Li, J. Wang, H. Zhu, J. Lin, S. Chang, and M. Guo, “Thinktrap: Denial-of-service attacks against black-box llm services via infinite thinking,”arXiv preprint arXiv:2512.07086, 2025

work page arXiv 2025

[19] [19]

Crabs: Consuming resource via auto-generation for llm-dos attack un- der black-box settings,

Y . Zhang, Z. Zhou, W. Zhang, X. Wang, X. Jia, Y . Liu, and S. Su, “Crabs: Consuming resource via auto-generation for llm-dos attack un- der black-box settings,” inFindings of the Association for Computational Linguistics: ACL 2025, 2025, pp. 11 128–11 150

2025

[20] [20]

Prompt-induced over-generation as denial-of-service: A black-box attack-side benchmark,

Y . Guo, J. Plested, T. Lynar, K. Thilakarathna, N. Sivaroopan, J. Yang, W. Yanget al., “Prompt-induced over-generation as denial-of-service: A black-box attack-side benchmark,”arXiv preprint arXiv:2512.23779, 2025

work page arXiv 2025

[21] [21]

Excessive reasoning attack on reasoning llms,

W. M. Si, M. Li, M. Backes, and Y . Zhang, “Excessive reasoning attack on reasoning llms,”arXiv preprint arXiv:2506.14374, 2025

work page arXiv 2025

[22] [22]

Extendattack: Attacking servers of lrms via extending reasoning,

Z. Zhu, Y . Liu, Z. Xu, Y . Ma, H. Gao, N. Chen, Y . Guo, W. Qu, H. Xu, Z. Kanget al., “Extendattack: Attacking servers of lrms via extending reasoning,”arXiv preprint arXiv:2506.13737, 2025

work page arXiv 2025

[23] [23]

Distractor injection attacks on large reasoning models: Characterization and defense,

Z. Zhang, W. Xu, S. Cui, and C. K. Reddy, “Distractor injection attacks on large reasoning models: Characterization and defense,”arXiv preprint arXiv:2510.16259, 2025

work page arXiv 2025

[24] [24]

Denial-of- service poisoning attacks against large language models,

K. Gao, T. Pang, C. Du, Y . Yang, S.-T. Xia, and M. Lin, “Denial-of- service poisoning attacks against large language models,”arXiv preprint arXiv:2410.10760, 2024

work page arXiv 2024

[25] [25]

Badthink: Triggered overthinking attacks on chain-of-thought reasoning in large language models,

S. Liu, R. Li, L. Yu, L. Zhang, Z. Liu, and G. Jin, “Badthink: Triggered overthinking attacks on chain-of-thought reasoning in large language models,”arXiv preprint arXiv:2511.10714, 2025

work page arXiv 2025

[26] [26]

Badreasoner: Planting tunable overthinking backdoors into large reasoning models for fun or profit,

B. Yi, Z. Fei, J. Geng, T. Li, L. Nie, Z. Liu, and Y . Li, “Badreasoner: Planting tunable overthinking backdoors into large reasoning models for fun or profit,”arXiv preprint arXiv:2507.18305, 2025

work page arXiv 2025

[27] [27]

Beyond max tokens: Stealthy resource amplification via tool calling chains in llm agents,

K. Zhou, Y . Zheng, Y . He, M. Xue, X. Gong, Y . Wang, and K.-Y . Lam, “Beyond max tokens: Stealthy resource amplification via tool calling chains in llm agents,”arXiv preprint arXiv:2601.10955, 2026

work page arXiv 2026

[28] [28]

Hidden tail: Adversarial image causing stealthy resource consumption in vision-language models,

R. Zhang, Z. Wang, T. Yang, H. Li, W. Jiang, Q. Zhao, Y . Liu, and G. Xu, “Hidden tail: Adversarial image causing stealthy resource consumption in vision-language models,”arXiv preprint arXiv:2508.18805, 2025

work page arXiv 2025

[29] [29]

Unveiling typographic deceptions: Insights of the typographic vulnerability in large vision-language models,

H. Cheng, E. Xiao, J. Gu, L. Yang, J. Duan, J. Zhang, J. Cao, K. Xu, and R. Xu, “Unveiling typographic deceptions: Insights of the typographic vulnerability in large vision-language models,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 179–196

2024

[30] [30]

Vision-llms can fool themselves with self-generated typographic at- tacks,

M. Qraitem, N. Tasnim, P. Teterwak, K. Saenko, and B. A. Plummer, “Vision-llms can fool themselves with self-generated typographic at- tacks,”arXiv preprint arXiv:2402.00626, 2024

work page arXiv 2024

[31] [31]

Scenetap: Scene-coherent typographic adversarial planner against vision-language models in real-world environments,

Y . Cao, Y . Xing, J. Zhang, D. Lin, T. Zhang, I. Tsang, Y . Liu, and Q. Guo, “Scenetap: Scene-coherent typographic adversarial planner against vision-language models in real-world environments,” inProceed- ings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 25 050–25 059

2025

[32] [32]

Transferable multimodal attack on vision-language pre-training 14 models,

H. Wang, K. Dong, Z. Zhu, H. Qin, A. Liu, X. Fang, J. Wang, and X. Liu, “Transferable multimodal attack on vision-language pre-training 14 models,” in2024 IEEE Symposium on Security and Privacy (SP). IEEE, 2024, pp. 1722–1740

2024

[33] [33]

Thinkswitcher: When to think hard, when to think fast,

G. Liang, L. Zhong, Z. Yang, and X. Quan, “Thinkswitcher: When to think hard, when to think fast,”arXiv preprint arXiv:2505.14183, 2025

work page arXiv 2025

[34] [34]

Lapo: Internalizing reasoning efficiency via length- adaptive policy optimization,

X. Wu, Y . Yan, S. Lyu, L. Wu, Y . Qiu, Y . Shen, W. Lu, J. Shao, J. Xiao, and Y . Zhuang, “Lapo: Internalizing reasoning efficiency via length- adaptive policy optimization,”arXiv preprint arXiv:2507.15758, 2025

work page arXiv 2025

[35] [35]

Frost: Filtering reasoning outliers with attention for efficient reasoning,

H. Luo, Z. Jiang, M. Z. Hasan, Y . Chen, and S. Sarkar, “Frost: Filtering reasoning outliers with attention for efficient reasoning,”arXiv preprint arXiv:2601.19001, 2026. APPENDIXA PROMPTTEMPLATES To ensure the reproducibility of our trigger discovery frame- work, we provide the exact templates used for seed generation (Stage 1) and structural decomposit...

work page arXiv 2026

[36] [36]

scenario description: The concrete driving environment or scene being described (e.g., road condition, traffic situation, objects)

[37] [37]

ambiguity or conflict: Any contradiction, ethical dilemma, uncertainty, unexpected event, or conflict that makes the situation non-trivial

[38] [38]

reasoning requirement: Explicit or implicit instructions that encourage careful thinking, step-by-step reasoning, ethical analysis, or deliberation

[39] [39]

constraints or formatting: Any constraints on how the answer should be structured or expressed (e.g., ”explain first, then answer”, ”step by step”, repetition rules, formatting requirements, or meta-instructions)

[40] [40]

final action request: The final question or decision request that asks the model to output an action, judgment, or recommendation. Input Prompt: ”{full text}” Return ONLY a valid JSON object with EXACTLY these five keys: - scenario description - ambiguity or conflict - reasoning requirement - constraints or formatting - final action request Do NOT include...