Lessons from Penetration Tests on Large-Scale Agent Systems
Pith reviewed 2026-06-29 16:36 UTC · model grok-4.3
The pith
Proprietary AI agent systems exhibit the same recurring security weaknesses as open-source agents despite stricter development processes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Penetration tests conducted in 2025 against two proprietary agent products found security weaknesses similar to those observed in prior open-source agent research, indicating that the security posture of AI agents has not substantially improved despite stricter coding standards and formal review processes.
What carries the argument
Penetration testing applied to proprietary agent products to surface cross-layer weaknesses in unbounded, self-modifying execution-capable AI agents.
If this is right
- Developers of execution-capable agents must still reason about and secure complex cross-layer behaviors.
- Recurring vulnerability classes persist across both open-source and proprietary development methodologies.
- The security burden on agent developers remains significant even under formal review processes.
- Prior research on open-source agents provides relevant lessons for proprietary products.
Where Pith is reading between the lines
- Security engineering for agents may need to shift from coding standards toward new architectural constraints on autonomy and tool use.
- Organizations deploying agents at scale could benefit from shared test suites that target the recurring weakness classes identified here.
- If the pattern holds, regulatory or certification requirements for agent systems might focus on observable interaction surfaces rather than internal development processes.
Load-bearing premise
The two proprietary products tested in 2025 are representative of the broader class of large-scale proprietary agent systems.
What would settle it
A penetration test on a third proprietary agent product from 2025 that finds none of the recurring weakness classes previously documented in open-source agents.
Original abstract
As AI systems gain increasing autonomy and execution capability, the number of discovered security vulnerabilities continues to rise. However, many of these vulnerabilities are not fundamentally novel, but instead reflect recurring classes of weaknesses long observed in prior computing systems. Execution-capable AI agents are effectively unbounded, self-modifying programs that interact extensively with multiple layers of the computing stack. This broad interaction surface imposes a significant security burden on developers, who must reason about and secure complex cross-layer behaviors. Prior research has primarily focused on vulnerabilities in open-source agents and agent frameworks. In contrast, it remains unclear whether proprietary agent systems -- developed under stricter coding standards and formal review processes -- exhibit similar security weaknesses. In this paper, we present findings from two penetration tests conducted in 2025 against proprietary agent products and evaluate whether the security posture of AI agents has improved since these assessments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents findings from two penetration tests conducted in 2025 on proprietary large-scale AI agent products. It claims that, despite development under stricter coding standards and formal review processes, these systems exhibit recurring classes of security weaknesses similar to those previously documented in open-source agents and frameworks, indicating that the security posture of execution-capable AI agents has not meaningfully improved.
Significance. If the two tested systems prove representative, the result would demonstrate that cross-layer interaction surfaces in autonomous agents impose persistent security burdens regardless of development paradigm, reinforcing the need for improved developer reasoning about unbounded, self-modifying behaviors. The use of real-world proprietary targets adds practical relevance beyond prior open-source studies.
major comments (2)
- [Abstract] The central claim that proprietary systems 'exhibit similar security weaknesses' to open-source ones rests on the two 2025 products being representative of the broader class, yet the abstract supplies no selection criteria, architectural comparison to other proprietary agents, population estimate, or discussion of access restrictions or selection bias.
- [Abstract] No specific vulnerabilities, methodology details, data, or error analysis are provided, so it is impossible to evaluate whether the observed weaknesses are in fact the same recurring classes or whether the tests support the 'has not improved' conclusion.
minor comments (1)
- [Abstract] The phrase 'since these assessments' is undefined; the abstract does not identify the prior open-source studies or the time frame used as the baseline.
Simulated Author's Rebuttal
We thank the referee for their review and for highlighting issues of generalizability and transparency in the abstract. We address each major comment below. We agree that the abstract can be strengthened with additional context on selection and methodology where feasible, but confidentiality constraints on the proprietary targets limit what can be disclosed. We propose partial revisions to the abstract accordingly.
Point-by-point responses
- Referee: [Abstract] The central claim that proprietary systems 'exhibit similar security weaknesses' to open-source ones rests on the two 2025 products being representative of the broader class, yet the abstract supplies no selection criteria, architectural comparison to other proprietary agents, population estimate, or discussion of access restrictions or selection bias.
  Authors: The two products were selected because they are large-scale, commercially deployed proprietary agent systems for which authorized penetration testing access was obtained under responsible disclosure agreements. No comprehensive public population estimate of such proprietary systems exists, precluding statistical sampling claims. Architectural comparisons to other agents appear in the introduction and related work of the full manuscript. We will revise the abstract to note the selection basis (large-scale commercial deployments) and access restrictions (NDA-bound responsible testing), which directly addresses the representativeness concern without revealing confidential details.
  Revision: partial
- Referee: [Abstract] No specific vulnerabilities, methodology details, data, or error analysis are provided, so it is impossible to evaluate whether the observed weaknesses are in fact the same recurring classes or whether the tests support the 'has not improved' conclusion.
  Authors: The abstract is kept high-level due to length limits and to avoid disclosing vendor-sensitive information. The full manuscript contains dedicated sections on the black-box and gray-box penetration testing methodology, anonymized examples of the recurring weakness classes, direct comparisons to prior open-source findings, and supporting analysis. We will revise the abstract to briefly reference the testing approach and direct readers to the detailed evaluations in the body. Specific vulnerability instances, raw data, and granular error analysis cannot be provided publicly.
  Revision: partial
- Disclosure of specific vulnerabilities, raw test data, or granular error analysis from the proprietary targets, which is precluded by non-disclosure agreements and ongoing remediation processes.
Circularity Check
No circularity: empirical report with direct observations only
full rationale
The paper is a report of penetration test findings on two specific proprietary agent products. It contains no equations, no fitted parameters, no derivations, and no self-citation chains that reduce any central claim to a prior result by construction. The strongest claim (similar weaknesses in proprietary vs. open-source agents) rests on the observed test outcomes rather than any definitional or predictive loop. Generalizability concerns are validity issues, not circularity.