European Conference on Computer Vision
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs
  DMN achieves over 90% attack success rate on GPT-4o, Gemini-2.5-pro and Claude Sonnet 4 by distributing instructions, supplying multimodal evidence, and adding number chain tasks across multiple images.
- SafeSteer: A Decoding-level Defense Mechanism for Multimodal Large Language Models
  SafeSteer improves safety in multimodal large language models by up to 33.4% via a decoding probe and modal alignment vector without any fine-tuning.