citation dossier

Prompt infection: Llm-to-llm prompt injection within multi-agent systems

D · 2024 · arXiv 2410.07283

17Pith papers citing it

17reference links

cs.CRtop field · 11 papers

UNVERDICTEDtop verdict bucket · 16 papers

This arXiv-backed work is queued for full Pith review when it crosses the high-inbound sweep. That review runs reader · skeptic · desk-editor · referee · rebuttal · circularity · lean confirmation · RS check · pith extraction.

read on arXiv PDF

why this work matters in Pith

Pith has found this work in 17 reviewed papers. Its strongest current cluster is cs.CR (11 papers). The largest review-status bucket among citing papers is UNVERDICTED (16 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.

representative citing papers

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

cs.CR · 2026-04-03 · accept · novelty 8.0

Agent Skills has structural security weaknesses from missing data-instruction boundaries, single-approval persistent trust, and absent marketplace reviews that require fundamental redesign.

Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries

cs.CR · 2026-05-12 · unverdicted · novelty 7.0

Identifies concrete attacks from a malicious Provider on SAGA and proposes SAGA-BFT, SAGA-MON, SAGA-AUD, and SAGA-HYB mitigations offering different security-performance trade-offs.

FlowSteer: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems

cs.CR · 2026-05-12 · unverdicted · novelty 7.0

FlowSteer is a prompt-only attack that biases multi-agent LLM workflow planning to propagate malicious signals, raising success rates by up to 55%, with FlowGuard as an input-side defense reducing it by up to 34%.

The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck

cs.CR · 2026-05-11 · unverdicted · novelty 7.0

PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in AgentDojo evaluations.

EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium

cs.AI · 2026-05-10 · unverdicted · novelty 7.0

EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to adversarial agents.

Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense

cs.CR · 2026-05-04 · unverdicted · novelty 7.0

Autonomous LLM agents can host self-propagating worms via persistent state re-entry, demonstrated with automated analysis tools and blocked by a formal no-propagation defense on three frameworks.

When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks

cs.CR · 2026-05-08 · unverdicted · novelty 6.0

Multi-agent LLM frameworks can spread compromises across agent boundaries via insecure memory inheritance during subagent spawning.

MAGIQ: A Post-Quantum Multi-Agentic AI Governance System with Provable Security

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

MAGIQ introduces a post-quantum secure system for policy definition, enforcement, and accountability in multi-agent AI using novel cryptographic protocols and UC framework proofs.

ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection

cs.CR · 2026-05-05 · unverdicted · novelty 6.0

ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.

When Embedding-Based Defenses Fail: Rethinking Safety in LLM-Based Multi-Agent Systems

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

Embedding-based defenses fail against attacks that align malicious message embeddings with benign ones in LLM multi-agent systems, but token-level confidence scores improve robustness by enabling better pruning of suspicious messages.

HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems

cs.CR · 2026-04-06 · unverdicted · novelty 6.0

HDP is a lightweight protocol that binds human authorization to sessions via signed append-only token chains, enabling offline verification of delegation provenance using only an Ed25519 public key and session identifier.

Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems

cs.MA · 2026-05-11 · unverdicted · novelty 5.0

Safety constraints in LLM-based multi-agent systems commonly weaken during execution through memory, communication, and tool use, requiring them to be maintained as explicit state rather than asserted once.

Insider Attacks in Multi-Agent LLM Consensus Systems

cs.MA · 2026-05-08 · unverdicted · novelty 5.0

A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.

A Low-Latency Fraud Detection Layer for Detecting Adversarial Interaction Patterns in LLM-Powered Agents

cs.AI · 2026-05-01 · unverdicted · novelty 5.0

Researchers developed a fast XGBoost-based detector using 42 runtime features to spot adversarial interaction patterns in LLM agents, running over 9 times faster than LLM detectors on synthetic multi-turn data.

SoK: Security of Autonomous LLM Agents in Agentic Commerce

cs.CR · 2026-04-15 · unverdicted · novelty 5.0

The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.

Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability

cs.CL · 2026-05-08 · unverdicted · novelty 4.0

The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

CASCADE: A Cascaded Hybrid Defense Architecture for Prompt Injection Detection in MCP-Based Systems

cs.CR · 2026-04-18 · unverdicted · novelty 4.0

CASCADE is a cascaded hybrid detector that combines fast regex/entropy filtering, BGE embeddings with local LLM fallback, and output pattern checks to achieve 95.85% precision and 6.06% false-positive rate against prompt injection and related attacks in MCP-based systems.

citing papers explorer

Showing 17 of 17 citing papers.

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis cs.CR · 2026-04-03 · accept · none · ref 30
Agent Skills has structural security weaknesses from missing data-instruction boundaries, single-approval persistent trust, and absent marketplace reviews that require fundamental redesign.
Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries cs.CR · 2026-05-12 · unverdicted · none · ref 35
Identifies concrete attacks from a malicious Provider on SAGA and proposes SAGA-BFT, SAGA-MON, SAGA-AUD, and SAGA-HYB mitigations offering different security-performance trade-offs.
FlowSteer: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems cs.CR · 2026-05-12 · unverdicted · none · ref 31
FlowSteer is a prompt-only attack that biases multi-agent LLM workflow planning to propagate malicious signals, raising success rates by up to 55%, with FlowGuard as an input-side defense reducing it by up to 34%.
The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck cs.CR · 2026-05-11 · unverdicted · none · ref 12
PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in AgentDojo evaluations.
EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium cs.AI · 2026-05-10 · unverdicted · none · ref 31
EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to adversarial agents.
Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense cs.CR · 2026-05-04 · unverdicted · none · ref 5
Autonomous LLM agents can host self-propagating worms via persistent state re-entry, demonstrated with automated analysis tools and blocked by a formal no-propagation defense on three frameworks.
When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks cs.CR · 2026-05-08 · unverdicted · none · ref 48
Multi-agent LLM frameworks can spread compromises across agent boundaries via insecure memory inheritance during subagent spawning.
MAGIQ: A Post-Quantum Multi-Agentic AI Governance System with Provable Security cs.LG · 2026-05-07 · unverdicted · none · ref 38
MAGIQ introduces a post-quantum secure system for policy definition, enforcement, and accountability in multi-agent AI using novel cryptographic protocols and UC framework proofs.
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection cs.CR · 2026-05-05 · unverdicted · none · ref 125
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
When Embedding-Based Defenses Fail: Rethinking Safety in LLM-Based Multi-Agent Systems cs.CR · 2026-05-01 · unverdicted · none · ref 18
Embedding-based defenses fail against attacks that align malicious message embeddings with benign ones in LLM multi-agent systems, but token-level confidence scores improve robustness by enabling better pruning of suspicious messages.
HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems cs.CR · 2026-04-06 · unverdicted · none · ref 5
HDP is a lightweight protocol that binds human authorization to sessions via signed append-only token chains, enabling offline verification of delegation provenance using only an Ed25519 public key and session identifier.
Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems cs.MA · 2026-05-11 · unverdicted · none · ref 21
Safety constraints in LLM-based multi-agent systems commonly weaken during execution through memory, communication, and tool use, requiring them to be maintained as explicit state rather than asserted once.
Insider Attacks in Multi-Agent LLM Consensus Systems cs.MA · 2026-05-08 · unverdicted · none · ref 51
A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
A Low-Latency Fraud Detection Layer for Detecting Adversarial Interaction Patterns in LLM-Powered Agents cs.AI · 2026-05-01 · unverdicted · none · ref 28
Researchers developed a fast XGBoost-based detector using 42 runtime features to spot adversarial interaction patterns in LLM agents, running over 9 times faster than LLM detectors on synthetic multi-turn data.
SoK: Security of Autonomous LLM Agents in Agentic Commerce cs.CR · 2026-04-15 · unverdicted · none · ref 119
The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.
Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability cs.CL · 2026-05-08 · unverdicted · none · ref 179
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.
CASCADE: A Cascaded Hybrid Defense Architecture for Prompt Injection Detection in MCP-Based Systems cs.CR · 2026-04-18 · unverdicted · none · ref 12
CASCADE is a cascaded hybrid detector that combines fast regex/entropy filtering, BGE embeddings with local LLM fallback, and output pattern checks to achieve 95.85% precision and 6.06% false-positive rate against prompt injection and related attacks in MCP-based systems.

Prompt infection: Llm-to-llm prompt injection within multi-agent systems

why this work matters in Pith

fields

years

verdicts

representative citing papers

citing papers explorer