InjecAgent benchmark demonstrates that tool-integrated LLM agents are vulnerable to indirect prompt injection attacks, with ReAct-prompted GPT-4 succeeding on 24% of attacks and nearly twice that rate when attacker instructions are reinforced.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2024 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
InjecAgent benchmark demonstrates that tool-integrated LLM agents are vulnerable to indirect prompt injection attacks, with ReAct-prompted GPT-4 succeeding on 24% of attacks and nearly twice that rate when attacker instructions are reinforced.