Recognition: no theorem link
When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks
Pith reviewed 2026-05-12 01:17 UTC · model grok-4.3
The pith
Subagent inheritance allows compromised LLM agents to spread malicious instructions across multi-agent networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In multi-agent LLM networks, subagent spawn operates as an inheritance channel that can breach trust boundaries. Current implementations allow malicious content in a parent's memory to be passed to children, weak controls on resources, persistence of stale data post-spawn, and improper authority over termination. The paper demonstrates these issues in practical frameworks and argues for introducing explicit security invariants to govern the spawn process.
What carries the argument
The subagent inheritance model, which treats spawn as the transfer of memory, resources, state, and termination authority from parent to child agents.
Load-bearing premise
The specific inheritance behaviors seen in the studied agent frameworks are typical of current multi-agent networks, and adding security invariants will fix the problems without creating fresh vulnerabilities.
What would settle it
A test where a parent agent is injected with a specific malicious instruction and then spawns a child, checking if the child exhibits the injected behavior without re-prompting.
Figures
read the original abstract
Since the official release of ChatGPT in 2022, large language models (LLMs) have rapidly evolved from chatbot-style interfaces into agentic systems that can delegate work through tools and newly spawned subagents. While these capabilities improve automation and scalability, they also pose new security risks in multi-agent networks. Existing research has studied how individual LLM-based agents can be compromised through prompt injection, jailbreaking, poisoned retrieval data, or malicious extensions. Less is known about what happens after one agent is compromised inside a multi-agent network. In particular, inherited memory from parent agents can carry malicious instructions, outdated states, or unintended behavioral rules into newly created subagents, allowing a local compromise to spread across agent boundaries. In this paper, we model contemporary multi-agent networks through the lens of subagent inheritance. Our analysis shows that current frameworks can violate trust boundaries through insecure memory inheritance, weak resource control, stale post-spawn state, and improper termination authority. We demonstrate these risks in real agent frameworks and propose defenses based on explicit security invariants. Our findings show that inheritance is not merely an implementation detail, but a central component influencing the security of multi-agent systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper models subagent spawn and inheritance mechanisms in multi-agent LLM networks. It identifies four trust-boundary violations—insecure memory inheritance, weak resource control, stale post-spawn state, and improper termination authority—demonstrates them in real frameworks, and proposes explicit security invariants as mitigations, arguing that inheritance is a central security factor rather than an implementation detail.
Significance. If the modeling of inheritance behaviors is accurate and the invariants can be shown to block the described attacks without side effects, the work would highlight an important propagation risk in agentic systems that has received less attention than single-agent prompt injection. Concrete demonstrations in existing frameworks add practical value, and the focus on invariants could inform more principled designs for multi-agent security.
major comments (2)
- The central claim that the proposed security invariants address the four identified risks without introducing new vulnerabilities (e.g., overly restrictive controls breaking legitimate delegation or new timing channels) is load-bearing but unsupported. The manuscript transitions from observed violations to proposed defenses without formal verification, completeness arguments, or re-testing of the original attack vectors under the invariants.
- Demonstrations of the four violation types in real frameworks are described at a high level in the abstract and analysis sections, but lack sufficient detail on the specific frameworks examined, the exact inheritance APIs or memory models exploited, and quantitative outcomes. This weakens the generality claim that current frameworks systematically violate trust boundaries.
minor comments (1)
- The abstract and introduction could more explicitly name the frameworks used for demonstrations and the precise security invariants (e.g., by listing them or referencing a table/definition).
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The comments highlight important areas where the manuscript can be strengthened, particularly around supporting the effectiveness of the proposed invariants and providing more concrete details on the demonstrations. We address each major comment below and describe the revisions we will make.
read point-by-point responses
-
Referee: The central claim that the proposed security invariants address the four identified risks without introducing new vulnerabilities (e.g., overly restrictive controls breaking legitimate delegation or new timing channels) is load-bearing but unsupported. The manuscript transitions from observed violations to proposed defenses without formal verification, completeness arguments, or re-testing of the original attack vectors under the invariants.
Authors: We agree that the current presentation of the invariants would benefit from stronger supporting arguments. In the revised manuscript we will add a dedicated subsection that provides informal completeness arguments for each invariant, mapping them explicitly to the four violation types and explaining the mechanisms by which they prevent propagation. We will also include a short discussion of potential side effects (e.g., restrictions on delegation patterns or introduction of new timing channels) and argue, based on the threat model, that these can be avoided with careful implementation. In addition, we will re-execute the attack vectors from at least one of the evaluated frameworks after applying the invariants and report the outcomes. While we do not add a full formal verification (which would require a different methodological scope), these additions will make the load-bearing claim substantially better supported. revision: partial
-
Referee: Demonstrations of the four violation types in real frameworks are described at a high level in the abstract and analysis sections, but lack sufficient detail on the specific frameworks examined, the exact inheritance APIs or memory models exploited, and quantitative outcomes. This weakens the generality claim that current frameworks systematically violate trust boundaries.
Authors: We accept that the current level of detail limits the strength of the generality claim. In the revision we will expand the evaluation section with a new table and accompanying text that names the concrete frameworks examined, describes the precise subagent-spawn and memory-inheritance APIs used, outlines the memory models involved, and reports quantitative results (attack success rates, state-propagation latency, and resource-consumption metrics before and after the proposed mitigations). These additions will make the demonstrations reproducible and will directly support the claim that the violations are systematic rather than anecdotal. revision: yes
Circularity Check
No circularity; modeling rests on external frameworks and observations
full rationale
The paper models subagent inheritance risks by examining real agent frameworks, identifies specific violations such as insecure memory inheritance, and proposes security invariants as defenses. No equations, fitted parameters, or derivations are present that reduce by construction to the paper's own inputs. Claims rely on external demonstrations rather than self-definitional steps or load-bearing self-citations. The argument is self-contained against benchmarks of existing multi-agent systems.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Contemporary multi-agent LLM frameworks pass memory, state, and behavioral rules from parent agents to spawned subagents.
Reference graph
Works this paper leans on
-
[1]
A survey of agentic ai and cybersecurity: Challenges, opportunities and use-case prototypes,
S. J. Lazer, K. Aryal, M. Gupta, and E. Bertino, “A survey of agentic ai and cybersecurity: Challenges, opportunities and use-case prototypes,”
-
[2]
Available: https://arxiv.org/abs/2601.05293
[Online]. Available: https://arxiv.org/abs/2601.05293
-
[3]
The path ahead for agentic ai: Challenges and opportunities,
N. Sibai, Y . Ahmed, S. Sibaee, S. AlHalawani, A. Ammar, and W. Boulila, “The path ahead for agentic ai: Challenges and opportunities,” 2026. [Online]. Available: https://arxiv.org/abs/2601. 02749
work page 2026
-
[4]
N. Shapira, C. Wendler, A. Yen, G. Sarti, K. Pal, O. Floody, A. Belfki, A. Loftus, A. R. Jannali, N. Prakash, J. Cui, G. Rogers, J. Brinkmann, C. Rager, A. Zur, M. Ripa, A. Sankaranarayanan, D. Atkinson, R. Gandikota, J. Fiotto-Kaufman, E. Hwang, H. Orgad, P. S. Sahil, N. Taglicht, T. Shabtay, A. Ambus, N. Alon, S. Oron, A. Gordon-Tapiero, Y . Kaplan, V ....
work page internal anchor Pith review arXiv 2026
-
[5]
Openclaw cve & security advisory tracker,
J. Gamblin, “Openclaw cve & security advisory tracker,” 2026. [Online]. Available: https://github.com/jgamblin/OpenClawCVEs/
work page 2026
- [6]
-
[7]
A safety and security framework for real-world agentic systems,
S. Ghosh, B. Simkin, K. Shiarlis, S. Nandi, D. Zhao, M. Fiedler, J. Bazinska, N. Pope, R. Prabhu, D. Rohreret al., “A safety and security framework for real-world agentic systems,” 2025. [Online]. Available: https://arxiv.org/abs/2511.21990
-
[8]
Agentic misalignment: How llms could be insider threats,
A. Lynch, B. Wright, C. Larson, S. J. Ritchie, S. Mindermann, E. Hubinger, E. Perez, and K. Troy, “Agentic misalignment: How llms could be insider threats,” 2025. [Online]. Available: https://arxiv.org/abs/2510.05179
-
[9]
Model context protocol (mcp): Landscape, security threats, and future research directions,
X. Hou, Y . Zhao, S. Wang, and H. Wang, “Model context protocol (mcp): Landscape, security threats, and future research directions,”
-
[10]
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions
[Online]. Available: https://arxiv.org/abs/2503.23278
work page internal anchor Pith review arXiv
-
[11]
H. Song, Y . Shen, W. Luo, L. Guo, T. Chen, J. Wang, B. Li, X. Zhang, and J. Chen, “Beyond the protocol: Unveiling attack vectors in the model context protocol (mcp) ecosystem,” 2025. [Online]. Available: https://arxiv.org/abs/2506.02040
-
[12]
Y . Louck, A. Stulman, and A. Dvir, “Improving google a2a protocol: Protecting sensitive data and mitigating unintended harms in multi-agent systems,” 2025. [Online]. Available: https://arxiv.org/abs/2505.12490
-
[13]
Security and privacy challenges of large language models: A survey,
B. C. Das, M. H. Amini, and Y . Wu, “Security and privacy challenges of large language models: A survey,”ACM Computing Surveys, vol. 57, no. 6, pp. 1–39, 2025
work page 2025
-
[14]
W. Zou, R. Geng, B. Wang, and J. Jia, “Poisonedrag: Knowledge corruption attacks to retrieval-augmented generation of large language models,” 2024. [Online]. Available: https://arxiv.org/abs/2402.07867
-
[15]
D. Schmotz, L. Beurer-Kellner, S. Abdelnabi, and M. Andriushchenko, “Skill-inject: Measuring agent vulnerability to skill file attacks,” 2026. [Online]. Available: https://arxiv.org/abs/2602.20156
-
[16]
How we built our multi-agent research system,
Anthropic, “How we built our multi-agent research system,” Jun 2025, accessed: 2026-03-16. [Online]. Available: https://www.anthropic.com/ engineering/multi-agent-research-system
work page 2025
-
[17]
S. Raza, R. Sapkota, M. Karkee, and C. Emmanouilidis, “Trism for agentic ai: A review of trust, risk, and security management in llm-based agentic multi-agent systems,” 2025. [Online]. Available: https://arxiv.org/abs/2506.04133
-
[18]
OpenClaw, “Openclaw documentation,” 2026, accessed: 2026-03-23. [Online]. Available: https://docs.openclaw.ai
work page 2026
-
[19]
Owasp top 10 for large language model applications,
OW ASP Foundation, “Owasp top 10 for large language model applications,” 2025. [Online]. Available: https://owasp.org/www- project-top-10-for-large-language-model-applications/
work page 2025
-
[20]
Security and privacy in llms: A comprehensive survey of threats and mitigation strategies,
A. D. E. Berini, N. Jamil, A.-E. Benrazek, A. Lakas, L. Ismail, M. A. Ferrag, and K.-Y . Lam, “Security and privacy in llms: A comprehensive survey of threats and mitigation strategies,” Information Fusion, vol. 132, p. 104241, 2026. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S156625352600120X
work page 2026
-
[21]
A survey on large language model based autonomous agents,
L. Wang, C. Ma, X. Feng, Z. Zhang, H. Yang, J. Zhang, Z. Chen, J. Tang, X. Chen, Y . Linet al., “A survey on large language model based autonomous agents,”Frontiers of Computer Science, vol. 18, no. 6, p. 186345, 2024
work page 2024
-
[22]
The rise and potential of large language model based agents: A survey,
Z. Xi, W. Chen, X. Guo, W. He, Y . Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou, R. Zheng, X. Fan, X. Wang, L. Xiong, Y . Zhou, W. Wang, C. Jiang, Y . Zou, X. Liu, Z. Yin, S. Dou, R. Weng, W. Cheng, Q. Zhang, W. Qin, Y . Zheng, X. Qiu, X. Huang, and T. Gui, “The rise and potential of large language model based agents: A survey,” 2023
work page 2023
-
[23]
ReAct: Synergizing reasoning and acting in language models,
S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y . Cao, “ReAct: Synergizing reasoning and acting in language models,” in International Conference on Learning Representations (ICLR), 2023
work page 2023
-
[24]
Re- flexion: Language agents with verbal reinforcement learning,
N. Shinn, F. Cassano, A. Gopinath, K. Narasimhan, and S. Yao, “Re- flexion: Language agents with verbal reinforcement learning,”Advances in neural information processing systems, vol. 36, pp. 8634–8652, 2023
work page 2023
-
[25]
Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face,
Y . Shen, K. Song, X. Tan, D. Li, W. Lu, and Y . Zhuang, “Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face,”Advances in Neural Information Processing Systems, vol. 36, pp. 38 154–38 180, 2023
work page 2023
-
[26]
Toolformer: Language models can teach themselves to use tools,
T. Schick, J. Dwivedi-Yu, R. Dess `ı, R. Raileanu, M. Lomeli, E. Hambro, L. Zettlemoyer, N. Cancedda, and T. Scialom, “Toolformer: Language models can teach themselves to use tools,”Advances in neural informa- tion processing systems, vol. 36, pp. 68 539–68 551, 2023
work page 2023
-
[27]
Generative agents: Interactive simulacra of human behavior,
J. S. Park, J. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein, “Generative agents: Interactive simulacra of human behavior,” inProceedings of the 36th annual acm symposium on user interface software and technology. Association for Computing Machinery, 2023, pp. 1–22
work page 2023
-
[28]
How openclaw works: Understanding ai agents through a real architecture,
B. Poudel, “How openclaw works: Understanding ai agents through a real architecture,” Feb 2026. [Online]. Available: https://bibek-poudel. medium.com/how-openclaw-works-understanding-ai-agents-through-a- real-architecture-5d59cc7a4764
work page 2026
-
[29]
Openclaw architecture, explained: How it works,
P. Perazzo, “Openclaw architecture, explained: How it works,” Feb
-
[30]
Available: https://ppaolo.substack.com/p/openclaw- system-architecture-overview#
[Online]. Available: https://ppaolo.substack.com/p/openclaw- system-architecture-overview#
-
[31]
Agent zero: A personal, organic agentic framework that grows and learns with you,
agent0ai, “Agent zero: A personal, organic agentic framework that grows and learns with you,” https://github.com/agent0ai/agent- zero, 2026, gitHub repository
work page 2026
-
[32]
Hermes agent: The agent that grows with you,
Nous Research, “Hermes agent: The agent that grows with you,” https: //github.com/NousResearch/hermes-agent, 2026, gitHub repository
work page 2026
-
[33]
Available: https://arxiv.org/abs/2504.03111
Z. Li, J. Cui, X. Liao, and L. Xing, “Les dissonances: Cross-tool harvesting and polluting in pool-of-tools empowered llm agents,” 2025. [Online]. Available: https://arxiv.org/abs/2504.03111
-
[34]
V oltAgent, “V oltagent,” 2026, accessed: 2026-03-18. [Online]. Available: https://voltagent.dev/
work page 2026
-
[35]
G. Cloud, “Vertex ai agent builder,” 2026, accessed: 2026-03-18. [Online]. Available: https://cloud.google.com/products/agent-builder
work page 2026
-
[36]
J. Kang, M. Ji, Z. Zhao, and T. Bai, “Memory os of ai agent,” 2025. [Online]. Available: https://arxiv.org/abs/2506.06326
-
[37]
Guide to attribute based access control (abac) definition and considerations (draft),
V . C. Hu, D. Ferraiolo, R. Kuhn, A. R. Friedman, A. J. Lang, M. M. Cogdell, A. Schnitzer, K. Sandlin, R. Miller, K. Scarfoneet al., “Guide to attribute based access control (abac) definition and considerations (draft),”NIST special publication, vol. 800, no. 162, pp. 1–54, 2013
work page 2013
- [38]
-
[39]
X. Wang, Z. Ji, W. Wang, Z. Li, D. Wu, and S. Wang, “Sok: Evaluating jailbreak guardrails for large language models,” 2025. [Online]. Available: https://arxiv.org/abs/2506.10597 14
-
[40]
A new era in llm security: Exploring security concerns in real-world llm-based systems,
F. Wu, N. Zhang, S. Jha, P. McDaniel, and C. Xiao, “A new era in llm security: Exploring security concerns in real-world llm-based systems,” arXiv preprint arXiv:2402.18649, 2024
-
[41]
The protection of information in computer systems,
J. H. Saltzer and M. D. Schroeder, “The protection of information in computer systems,”Proceedings of the IEEE, vol. 63, no. 9, pp. 1278– 1308, 1975
work page 1975
-
[42]
Prompt Injection attack against LLM-integrated Applications
Y . Liu, G. Deng, Y . Li, K. Wang, Z. Wang, X. Wang, T. Zhang, Y . Liu, H. Wang, Y . Zhenget al., “Prompt injection attack against llm-integrated applications,”arXiv preprint arXiv:2306.05499, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[43]
K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, “Not what you’ve signed up for: Compromising real-world llm- integrated applications with indirect prompt injection,” inProceedings of the 16th ACM workshop on artificial intelligence and security, 2023, pp. 79–90
work page 2023
-
[44]
Benchmarking and defending against indirect prompt injection attacks on large language models,
J. Yi, Y . Xie, B. Zhu, E. Kiciman, G. Sun, X. Xie, and F. Wu, “Benchmarking and defending against indirect prompt injection attacks on large language models,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V . 1, 2025, pp. 1809–1820
work page 2025
-
[45]
Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “Injecagent: Benchmark- ing indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics: ACL 2024, 2024, pp. 10 471–10 506
work page 2024
-
[46]
Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents,
E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fischer, and F. Tram`er, “Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents,”Advances in Neural Information Processing Systems, vol. 37, pp. 82 895–82 920, 2024
work page 2024
-
[47]
Evil Geniuses : Delving into the Safety of LLM -based Agents , February 2024
Y . Tian, X. Yang, J. Zhang, Y . Dong, and H. Su, “Evil geniuses: Delving into the safety of llm-based agents,”arXiv preprint arXiv:2311.11855, 2023
-
[48]
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
D. Lee and M. Tiwari, “Prompt infection: Llm-to-llm prompt injection within multi-agent systems,”arXiv preprint arXiv:2410.07283, 2024
work page internal anchor Pith review arXiv 2024
-
[49]
The Dark Side of LLMs: Agent-based Attack Vectors for System-level Compromise
M. Lupinacci, F. A. Pironti, F. Blefari, F. Romeo, L. Arena, and A. Furfaro, “The dark side of llms: Agent-based attacks for complete computer takeover,”arXiv preprint arXiv:2507.06850, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[50]
Multi-agent systems execute arbitrary malicious code
H. Triedman, R. Jha, and V . Shmatikov, “Multi-agent systems execute arbitrary malicious code,”arXiv preprint arXiv:2503.12188, 2025
-
[51]
The confused deputy: (or why capabilities might have been invented),
N. Hardy, “The confused deputy: (or why capabilities might have been invented),”ACM SIGOPS Operating Systems Review, vol. 22, no. 4, pp. 36–38, 1988
work page 1988
-
[52]
Netsafe: Exploring the topological safety of multi-agent networks,
M. Yu, S. Wang, G. Zhang, J. Mao, C. Yin, Q. Liu, Q. Wen, K. Wang, and Y . Wang, “Netsafe: Exploring the topological safety of multi-agent networks,”arXiv preprint arXiv:2410.15686, 2024
-
[53]
Red-teaming llm multi-agent systems via communication attacks,
P. He, Y . Lin, S. Dong, H. Xu, Y . Xing, and H. Liu, “Red-teaming llm multi-agent systems via communication attacks,” inFindings of the Association for Computational Linguistics: ACL 2025, 2025, pp. 6726– 6747
work page 2025
-
[54]
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
E. Wallace, K. Xiao, R. Leike, L. Weng, J. Heidecke, and A. Beutel, “The instruction hierarchy: Training llms to prioritize privileged instruc- tions,”arXiv preprint arXiv:2404.13208, 2024
work page internal anchor Pith review arXiv 2024
-
[55]
Defeating Prompt Injections by Design
E. Debenedetti, I. Shumailov, T. Fan, J. Hayes, N. Carlini, D. Fabian, C. Kern, C. Shi, A. Terzis, and F. Tram `er, “Defeating prompt injections by design,”arXiv preprint arXiv:2503.18813, 2025
work page internal anchor Pith review arXiv 2025
-
[56]
On optimistic methods for concurrency control,
H.-T. Kung and J. T. Robinson, “On optimistic methods for concurrency control,”ACM Transactions on Database Systems (TODS), vol. 6, no. 2, pp. 213–226, 1981
work page 1981
-
[57]
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
H. Zhang, J. Huang, K. Mei, Y . Yao, Z. Wang, C. Zhan, H. Wang, and Y . Zhang, “Agent security bench (asb): Formalizing and bench- marking attacks and defenses in llm-based agents,”arXiv preprint arXiv:2410.02644, 2024
work page internal anchor Pith review arXiv 2024
-
[58]
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
M. Andriushchenko, A. Souly, M. Dziemian, D. Duenas, M. Lin, J. Wang, D. Hendrycks, A. Zou, Z. Kolter, M. Fredriksonet al., “Agentharm: A benchmark for measuring harmfulness of llm agents,” arXiv preprint arXiv:2410.09024, 2024. 15
work page internal anchor Pith review arXiv 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.