Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

· 2024 · arXiv 2402.08567

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

background 3 · support 1

representative citing papers

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

cs.CR · 2026-05-30 · unverdicted · novelty 8.0

Identifies cross-app context poisoning in ChatGPT Apps, a persistent indirect prompt injection delivered through undocumented first-party API parameters that lets one app manipulate others via the shared untagged context.

Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems

cs.MA · 2024-10-09 · unverdicted · novelty 8.0

Prompt injection attacks can self-replicate across LLM agents in multi-agent systems, enabling data theft, misinformation, and system disruption while propagating silently.

MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents

cs.CR · 2026-05-05 · unverdicted · novelty 7.0 · 2 refs

MEMSAD links anomaly detection gradients to retrieval objectives under encoder regularity to certify detection of continuous memory poisons, achieving perfect TPR/FPR in experiments while exposing a synonym-invariance gap.

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

cs.CR · 2026-04-25 · unverdicted · novelty 7.0

A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

cs.AI · 2026-02-24 · unverdicted · novelty 7.0

The work creates a new benchmark for humanizing GUI agent touch dynamics via a MinMax detector-agent model, a mobile touch dataset, and methods showing agents can match human behavior without losing task performance.

Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs

cs.AI · 2026-04-14 · unverdicted · novelty 6.0

MemJack achieves 71.48% attack success rate on unmodified COCO val2017 images against Qwen3-VL-Plus by coordinating agents to map visual entities to malicious intents, apply multi-angle camouflage, and filter refusals via iterative nullspace projection while transferring strategies through a shared

A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?

cs.CL · 2025-05-16 · unverdicted · novelty 5.0

A survey that defines Computer-Using Agents for safety analysis, categorizes their threats, proposes a taxonomy of defensive strategies, and summarizes benchmarks and datasets for evaluating CUA safety and performance.

Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout

cs.CR · 2026-04-10 · unverdicted · novelty 4.0

FinSec is a multi-stage detection system for financial LLM dialogues that reaches 90.13% F1 score, cuts attack success rate to 9.09%, and raises AUPRC to 0.9189.

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

cs.CL · 2025-03-27 · accept · novelty 3.0

A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.

LLM-Powered AI Agent Systems and Their Applications in Industry

cs.AI · 2025-05-22 · unverdicted · novelty 2.0

A survey categorizing LLM-powered agent systems into software-based, physical, and hybrid types, covering industrial applications and challenges such as latency and security.

citing papers explorer

Showing 10 of 10 citing papers.

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs · cs.CR · 2026-05-30 · unverdicted · none · ref 16
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems · cs.MA · 2024-10-09 · unverdicted · none · ref 58
MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents · cs.CR · 2026-05-05 · unverdicted · none · ref 7 · 2 links
A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework · cs.CR · 2026-04-25 · unverdicted · none · ref 12
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization · cs.AI · 2026-02-24 · unverdicted · none · ref 40
Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs · cs.AI · 2026-04-14 · unverdicted · none · ref 11
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron? · cs.CL · 2025-05-16 · unverdicted · none · ref 3
Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout · cs.CR · 2026-04-10 · unverdicted · none · ref 14
Large Language Model Agent: A Survey on Methodology, Applications and Challenges · cs.CL · 2025-03-27 · accept · none · ref 220
LLM-Powered AI Agent Systems and Their Applications in Industry · cs.AI · 2025-05-22 · unverdicted · none · ref 113
