arXiv: 2510.21236 · v3 · submitted 2025-10-24 · cs.CR · cs.AI · cs.SE

AgentBound: Securing Execution Boundaries of AI Agents

Pith reviewed 2026-05-18 05:09 UTC · model grok-4.3

classification: cs.CR, cs.AI, cs.SE
keywords: access control, MCP servers, AI agents, declarative policies, policy enforcement, Android permissions, LLM security, runtime containment

The pith

AgentBound introduces the first access control framework for MCP servers that uses declarative policies to contain malicious behavior without requiring server modifications.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

AI agents use the Model Context Protocol to connect large language models to external tools and environments, but thousands of MCP servers currently run with unrestricted host access and create a large attack surface. The paper presents AgentBound as a framework that applies declarative policies modeled on the Android permission system together with a runtime enforcement engine. Policies can be generated automatically from server source code with 80.9 percent accuracy across a dataset of the 296 most popular MCP servers. The enforcement layer blocks the majority of threats observed in several malicious servers while adding negligible overhead and without any changes to the servers themselves. This approach gives developers and project managers a concrete method to limit what MCP servers can do while keeping the tools usable for normal work.

Core claim

AgentBound is the first access control framework for MCP servers that combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. On a dataset of 296 popular servers, access control policies can be generated automatically from source code at 80.9 percent accuracy. The same engine blocks the majority of security threats present in several malicious MCP servers and introduces negligible overhead.

What carries the argument

Declarative policy mechanism modeled after Android permissions together with a runtime policy enforcement engine that intercepts and restricts server actions without code changes.
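
To make the pattern concrete, here is a minimal sketch of a declarative policy paired with a default-deny check. It is not the paper's AgentManifest format or its enforcement engine: the `Manifest` fields, the action names (`fs.read`, `fs.write`, `net.connect`), the paths, and the host are all hypothetical, chosen only to illustrate the idea of declaring permissions up front and mediating each action against them.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Manifest:
    """Hypothetical declarative policy: each field lists what a server may touch."""
    read_paths: tuple = ()
    write_paths: tuple = ()
    network_hosts: tuple = ()


def is_allowed(manifest: Manifest, action: str, target: str) -> bool:
    """Default-deny check: permit a target only if it matches a declared pattern."""
    patterns = {
        "fs.read": manifest.read_paths,
        "fs.write": manifest.write_paths,
        "net.connect": manifest.network_hosts,
    }.get(action, ())  # unknown actions get no patterns, so they are denied
    return any(fnmatchcase(target, pattern) for pattern in patterns)


# Illustrative manifest for an imagined weather server (all values are made up).
weather_server = Manifest(
    read_paths=("/srv/weather/cache/*",),
    network_hosts=("api.example.org",),
)

print(is_allowed(weather_server, "net.connect", "api.example.org"))
print(is_allowed(weather_server, "fs.write", "/home/user/.ssh/id_rsa"))
print(is_allowed(weather_server, "proc.spawn", "/bin/sh"))
```

A string-level check like this one is easy to bypass (for example with `..` path segments), which is one reason real enforcement is normally done by a sandbox outside the server process rather than by pattern matching inside it.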

If this is right

  • Developers can add security boundaries to existing MCP servers without rewriting or recompiling their code.
  • Automatic policy generation from source code lowers the cost of creating and maintaining access rules for large numbers of servers.
  • The enforcement engine limits the impact of malicious MCP servers on the host system.
  • Negligible runtime overhead means the protection layer can be left on during normal agent operation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same declarative-policy-plus-enforcement pattern could be applied to other agent-tool connection protocols that currently grant broad host access.
  • Higher accuracy in automatic policy generation would reduce the remaining manual review needed for production use.
  • Organizations could adopt generated policies as a lightweight vetting step before allowing third-party MCP servers in their environments.

Load-bearing premise

That policies generated automatically from source code at 80.9 percent accuracy are sufficient to block real-world threats while preserving usability, and that the enforcement engine can reliably contain malicious behavior without false negatives that matter in practice.

What would settle it

A malicious MCP server that still succeeds in performing a restricted harmful action after AgentBound policies are applied, or a large set of legitimate servers that become unusable because the generated policies incorrectly block required operations.
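
The two falsifying outcomes can be counted mechanically once operations are labeled. The sketch below assumes a simplified model in which a policy is just a set of granted permission names and each observed operation needs exactly one permission; the names and the sample operations are hypothetical and do not come from the paper's experiments.

```python
from typing import NamedTuple


class Operation(NamedTuple):
    permission: str   # hypothetical permission name the operation needs
    malicious: bool   # ground-truth label assigned by the evaluator


def settle(granted: set, operations: list) -> dict:
    """Count the two outcomes that would count against the claim."""
    escaped = [op for op in operations if op.malicious and op.permission in granted]
    broken = [op for op in operations if not op.malicious and op.permission not in granted]
    return {"escaped": len(escaped), "broken": len(broken)}


granted = {"fs.read", "net.connect"}
ops = [
    Operation("fs.read", malicious=False),     # allowed, as intended
    Operation("fs.write", malicious=False),    # blocked: usability failure
    Operation("proc.spawn", malicious=True),   # blocked, as intended
    Operation("net.connect", malicious=True),  # allowed: containment failure
]
print(settle(granted, ops))  # {'escaped': 1, 'broken': 1}
```

The last sample operation shows a limit of any permission-level model: a harmful action that only uses a permission the server legitimately holds is not stopped by the policy.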

Figures

Figures reproduced from arXiv: 2510.21236 by Christoph Bühler, Guido Salvaneschi, Luca Di Grazia, Matteo Biagiola.

Figure 1: Interaction between user, agent, LLM, and two MCP servers.
Figure 2: Overview of AgentBound. Users interact with the AI agent, which interacts, through the policy enforcement engine (AgentBox), with MCP servers communicating with the environment. The policy enforcement engine ensures an MCP server can only access the resources allowed by the access control policy (AgentManifest).
Figure 3: Distribution of Top-5 Access Control Policy
Figure 5: Counts per attack type, excluding generic attacks, split into preventable and non-preventable.
Figure 6: Performance comparison in milliseconds (ms) of MCP servers with sandboxing (S) and native (N):
Original abstract

Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that the policy enforcement engine introduces negligible overhead. Our contributions provide developers and project managers with a foundation for securing MCP servers while maintaining productivity, enabling researchers and tool builders to explore new directions for declarative access control and MCP security.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces AgentBound as the first access control framework for Model Context Protocol (MCP) servers used by AI agents. It combines a declarative policy mechanism (inspired by the Android permission model) with a policy enforcement engine that contains malicious behavior without requiring modifications to the MCP servers. The authors collect a dataset of the 296 most popular MCP servers, demonstrate automatic policy generation from source code achieving 80.9% accuracy, show that the system blocks the majority of security threats in several malicious MCP servers, and report negligible overhead from the enforcement engine.

Significance. If the empirical results hold under a clearly specified threat model and error analysis, AgentBound would provide a practical, low-friction approach to securing the growing ecosystem of MCP servers. The combination of automatic policy generation from source code with an unmodified-server enforcement engine addresses a real deployment constraint and could serve as a foundation for declarative access control in agent-tool interactions. The work ships a concrete dataset and measurements on real-world servers, which strengthens its utility for follow-on research.

major comments (2)
  1. [Evaluation section] Policy generation results: The reported 80.9% accuracy on the 296-server dataset is presented without specifying the metric (micro/macro accuracy, precision, recall, or F1), without a per-permission-type breakdown (e.g., file-system write, network outbound, process spawn), and without false-negative rates on security-critical permissions. Because the central claim is that automatically generated policies plus the enforcement engine contain malicious behavior, the absence of this analysis leaves open the possibility that errors concentrate exactly on high-risk actions, undermining the 'blocks the majority of security threats' result.
  2. [Evaluation section] Threat model and malicious-server experiments: The paper states that AgentBound 'blocks the majority of security threats in several malicious MCP servers' but provides no explicit description of the threat model, the specific malicious behaviors tested, the success criteria for blocking, or the number and diversity of malicious servers used. This information is load-bearing for validating that the enforcement engine reliably contains threats across the claimed diverse set of MCP servers.
minor comments (2)
  1. [Abstract] The abstract and introduction should clarify whether the 80.9% figure is an aggregate across all permissions or weighted by risk level.
  2. [System Design] Notation for policy elements (e.g., how declarative policies map to enforcement checks) could be made more explicit with a small example table.
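
The concern in the first major comment, that a high aggregate accuracy can hide errors concentrated on one risky permission, can be illustrated with a small calculation. The sketch below assumes each server is labeled with a set of permission names; the names and the four-server sample are invented for illustration and are not the paper's data or its metric definition.

```python
from collections import defaultdict


def overall_accuracy(truth, predicted, universe):
    """Fraction of (server, permission) decisions that match the ground truth."""
    correct = sum(
        (perm in want) == (perm in got)
        for want, got in zip(truth, predicted)
        for perm in universe
    )
    return correct / (len(truth) * len(universe))


def per_permission_report(truth, predicted):
    """Precision, recall and false-negative rate for each permission."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for want, got in zip(truth, predicted):
        for perm in want & got:
            counts[perm]["tp"] += 1
        for perm in got - want:
            counts[perm]["fp"] += 1
        for perm in want - got:
            counts[perm]["fn"] += 1
    report = {}
    for perm, c in counts.items():
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        report[perm] = {
            "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
            "recall": c["tp"] / actual_pos if actual_pos else 0.0,
            "fn_rate": c["fn"] / actual_pos if actual_pos else 0.0,
        }
    return report


# Made-up labels for four servers; one server's write permission is missed.
universe = ["fs.read", "fs.write", "net.connect", "proc.spawn"]
truth = [{"fs.read", "net.connect"}, {"fs.write"}, {"fs.write", "proc.spawn"}, {"fs.read"}]
predicted = [{"fs.read", "net.connect"}, set(), {"fs.write", "proc.spawn"}, {"fs.read"}]

print(overall_accuracy(truth, predicted, universe))         # 0.9375
print(per_permission_report(truth, predicted)["fs.write"])  # recall 0.5, fn_rate 0.5
```

In this toy sample the aggregate figure is above 93 percent while half of the true `fs.write` requirements are missed, which is the kind of gap a per-permission breakdown would expose.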

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We agree that additional specificity in the evaluation metrics and threat model description will strengthen the paper and address potential concerns about the reliability of our results. We will revise the manuscript to incorporate these clarifications.

Point-by-point responses
  1. Referee: [Evaluation section] Policy generation results: The reported 80.9% accuracy on the 296-server dataset is presented without specifying the metric (micro/macro accuracy, precision, recall, or F1), without a per-permission-type breakdown (e.g., file-system write, network outbound, process spawn), and without false-negative rates on security-critical permissions. Because the central claim is that automatically generated policies plus the enforcement engine contain malicious behavior, the absence of this analysis leaves open the possibility that errors concentrate exactly on high-risk actions, undermining the 'blocks the majority of security threats' result.

    Authors: We agree that the current presentation of the 80.9% accuracy figure lacks necessary detail. In the revised manuscript we will explicitly define the metric as overall accuracy (fraction of correctly classified permissions across the entire 296-server dataset). We will add a per-permission-type breakdown with precision, recall, and F1 scores for categories including file-system write, network outbound, and process spawn. We will also report false-negative rates specifically for security-critical permissions. These additions will allow readers to verify that misclassifications do not concentrate on high-risk actions and will directly support the claim that the generated policies plus enforcement engine contain malicious behavior. revision: yes

  2. Referee: [Evaluation section] Threat model and malicious-server experiments: The paper states that AgentBound 'blocks the majority of security threats in several malicious MCP servers' but provides no explicit description of the threat model, the specific malicious behaviors tested, the success criteria for blocking, or the number and diversity of malicious servers used. This information is load-bearing for validating that the enforcement engine reliably contains threats across the claimed diverse set of MCP servers.

    Authors: We acknowledge that the malicious-server experiments require a clearer threat model and experimental details. In the revision we will add an explicit threat-model subsection describing the assumed attacker (malicious MCP servers attempting unauthorized host access) and the specific behaviors tested (e.g., unauthorized file writes, outbound network connections, and process spawning). We will define success criteria (enforcement engine blocks the action without server modification or crash) and report the exact number and diversity of malicious servers evaluated. These changes will substantiate the statement that AgentBound blocks the majority of security threats. revision: yes
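
The success criterion named in the second response (the action is blocked and the server does not crash) lends itself to a per-behavior tally. The following is a minimal sketch under that reading; the `Trial` record, the behavior labels, and the sample trials are hypothetical and are not results from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trial:
    behavior: str         # e.g. "file_write", "net_outbound", "proc_spawn"
    action_blocked: bool  # did the enforcement layer stop the action?
    server_crashed: bool  # did the server die as a side effect?


def contained(trial: Trial) -> bool:
    """Success criterion: the action is blocked and the server keeps running."""
    return trial.action_blocked and not trial.server_crashed


def blocking_rates(trials) -> dict:
    """Share of trials per behavior that meet the success criterion."""
    total, ok = Counter(), Counter()
    for trial in trials:
        total[trial.behavior] += 1
        ok[trial.behavior] += contained(trial)
    return {behavior: ok[behavior] / total[behavior] for behavior in total}


# Invented trial outcomes, for illustration only.
trials = [
    Trial("file_write", action_blocked=True, server_crashed=False),
    Trial("file_write", action_blocked=True, server_crashed=True),
    Trial("net_outbound", action_blocked=False, server_crashed=False),
    Trial("proc_spawn", action_blocked=True, server_crashed=False),
]
print(blocking_rates(trials))
```

Reporting the rate per behavior, rather than one pooled number, is what lets a reader judge how much of "the majority of security threats" falls on each attack type.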

Circularity Check

0 steps flagged

No circularity: central claims rest on new system design and independent empirical measurements

Full rationale

The paper introduces AgentBound as a new access-control framework, constructs a fresh dataset of 296 MCP servers, and reports measured outcomes (80.9% policy-generation accuracy, threat blocking on malicious servers, negligible overhead). None of these steps reduce by construction to prior fitted parameters, self-citations, or redefinitions; the accuracy figure and blocking results are direct observations on collected data rather than tautological outputs of the same inputs. Reliance on the Android permission model is an external inspiration, not a load-bearing self-citation chain. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper introduces a new enforcement architecture whose correctness depends on standard security assumptions about policy specification and runtime mediation rather than new fitted parameters or invented physical entities.

axioms (2)
  • domain assumption: Source code of MCP servers contains sufficient static information to infer required permissions with usable accuracy
    The 80.9% automatic generation result rests on this assumption about code analysis.
  • domain assumption: The policy enforcement engine can intercept and block all relevant host operations without modifying the server binary
    Central to the claim that no server changes are needed.
invented entities (1)
  • AgentBound policy enforcement engine (no independent evidence)
    purpose: Runtime mediation of MCP server actions according to declarative policies
    The core new component proposed by the paper; no independent evidence outside the system itself is described.
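
The first axiom, that source code carries enough static signal to infer permissions, can be probed with even a very naive analysis. The sketch below is not the paper's policy generation method, which this page does not describe; it is a hypothetical import-based heuristic for Python sources, and the `MODULE_PERMISSIONS` table and permission names are assumptions made for illustration.

```python
import ast

# Hypothetical mapping from imported Python modules to coarse permissions.
MODULE_PERMISSIONS = {
    "socket": "net.connect",
    "urllib": "net.connect",
    "http": "net.connect",
    "subprocess": "proc.spawn",
    "shutil": "fs.write",
}


def infer_permissions(source: str) -> set:
    """Collect permissions implied by the top-level package of each import."""
    needed = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules = [node.module]
        else:
            continue  # not an import, or a relative import with no module name
        for module in modules:
            root = module.split(".")[0]
            if root in MODULE_PERMISSIONS:
                needed.add(MODULE_PERMISSIONS[root])
    return needed


server_source = '''
import subprocess
from urllib.request import urlopen

def run(cmd):
    return subprocess.run(cmd, capture_output=True)
'''
print(sorted(infer_permissions(server_source)))  # ['net.connect', 'proc.spawn']
```

A heuristic this simple misses file writes through the built-in `open`, dynamic imports, and anything done by dependencies, which shows why the axiom is an assumption rather than a given.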

pith-pipeline@v0.9.0 · 5727 in / 1470 out tokens · 48555 ms · 2026-05-18T05:09:07.864462+00:00 · methodology

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Do Coding Agents Understand Least-Privilege Authorization?

    cs.CR 2026-05 unverdicted novelty 7.0

    Coding agents struggle to infer least-privilege file permissions by omitting needed accesses while granting unused or sensitive ones, but Sufficiency-Tightness Decomposition improves sensitive-task success by up to 15...

  2. MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security

    cs.CR 2026-04 conditional novelty 7.0

    MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.

  3. From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

    cs.CR 2026-04 unverdicted novelty 7.0

    Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.

  4. Exploiting LLM Agent Supply Chains via Payload-less Skills

    cs.CR 2026-05 conditional novelty 6.0

    Semantic Compliance Hijacking lets attackers hijack LLM agents by disguising malicious instructions as compliance rules in skills, reaching up to 77.67% success on confidentiality breaches and 67.33% on RCE while evad...

  5. Tracking Capabilities for Safer Agents

    cs.AI 2026-03 unverdicted novelty 6.0

    AI agents can generate code in a capability-safe Scala dialect that statically prevents information leakage and malicious side effects while preserving task performance.
