OpenClaw PRISM: A Zero-Fork, Defense-in-Depth Runtime Security Layer for Tool-Augmented LLM Agents
7 Pith papers cite this work.
Years: 2026 (7)
Verdicts: unverdicted (7)
Roles: background (4)
citing papers explorer
- Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents
  The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
- LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection
  LivePI benchmark reports indirect prompt injection success rates of 10.7-29.6% across five models on seven input surfaces and shows a two-layer defense blocking all malicious completions while preserving utility.
- When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents
  Routine user chats can unintentionally poison the long-term state of personalized LLM agents, causing authorization drift, tool escalation, and unchecked autonomy, as measured by a new benchmark and reduced by the StateGuard defense.
- Security Considerations for Multi-agent Systems
  No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
- BraveGuard: From Open-World Threats to Safer Computer-Use Agents
  BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 82.38%.
- Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation
  A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.
- Security of OpenClaw Agents: Fundamentals, Attacks, and Countermeasures
  A survey that categorizes threats to OpenClaw agents including skill poisoning and cognitive manipulation and reviews defense mechanisms.