High-Entropy Tokens as Multimodal Failure Points in Vision-Language Models

· 2025 · cs.CV · arXiv 2512.21815

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open full Pith review browse 1 citing papers arXiv PDF

abstract

Vision-language models (VLMs) achieve remarkable performance but remain vulnerable to adversarial attacks. Entropy, as a measure of model uncertainty, is highly correlated with VLM reliability. While prior entropy-based attacks maximize uncertainty at all decoding steps, implicitly assuming that every token equally contributes to model instability, we reveal that a small fraction (around 20%) of high-entropy tokens, in the evaluated representative open-source VLMs with diverse architectures, concentrates a disproportionate share of adversarial influence during autoregressive generation. We demonstrate that concentrating adversarial perturbations on these high-entropy positions achieves comparable semantic degradation to global methods while optimizing fewer decoding positions. Additionally, across multiple representative VLMs, such attacks induce not only semantic drift but also a substantial unsafe subset (20-31%) under the current pipeline. Remarkably, since such vulnerable high-entropy tokens recur across architecturally diverse VLMs, attacks focused on them exhibit non-trivial transferability. Motivated by these findings, we design a simple Entropy-Guided Attack (EGA) that operationalizes sparse high-entropy targeting and extends it with a reusable token bank, yielding competitive attack success rates (93-95%) with a considerable harmful rate (30.2-38.6%) on the three representative open-source VLMs.

representative citing papers

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

UJEM-KL improves cross-model transferability of untargeted jailbreaks on vision-language models by maximizing entropy at decision tokens instead of forcing specific outputs.

citing papers explorer

Showing 1 of 1 citing paper.

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization cs.CV · 2026-05-11 · unverdicted · none · ref 9 · internal anchor
UJEM-KL improves cross-model transferability of untargeted jailbreaks on vision-language models by maximizing entropy at decision tokens instead of forcing specific outputs.

High-Entropy Tokens as Multimodal Failure Points in Vision-Language Models

fields

years

verdicts

representative citing papers

citing papers explorer