AgentBound: Securing Execution Boundaries of AI Agents
Pith reviewed 2026-05-18 05:09 UTC · model grok-4.3
The pith
AgentBound introduces the first access control framework for MCP servers that uses declarative policies to contain malicious behavior without requiring server modifications.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AgentBound is the first access control framework for MCP servers that combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. On a dataset of 296 popular servers, access control policies can be generated automatically from source code at 80.9 percent accuracy. The same engine blocks the majority of security threats present in several malicious MCP servers and introduces negligible overhead.
What carries the argument
Declarative policy mechanism modeled after Android permissions together with a runtime policy enforcement engine that intercepts and restricts server actions without code changes.
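To make the pattern concrete, here is a minimal Python sketch of a declarative policy paired with an enforcement check. The permission names, the `Policy` class, and the `is_allowed` helper are hypothetical and are not taken from the paper. The sketch only shows the decision logic; it does not attempt the runtime interception that the AgentBound engine performs.

```python
import posixpath
from dataclasses import dataclass

# Hypothetical permission names, loosely modeled on Android-style manifest entries.
FS_READ = "filesystem.read"
FS_WRITE = "filesystem.write"
NET_OUT = "network.outbound"


@dataclass(frozen=True)
class Policy:
    """Declarative policy: which permissions a server holds, and their scope."""
    permissions: frozenset
    allowed_paths: tuple = ()  # path prefixes the server may touch
    allowed_hosts: tuple = ()  # hosts the server may connect to


def _within(path: str, prefix: str) -> bool:
    """True if the normalized path is the prefix itself or lies beneath it."""
    norm = posixpath.normpath(path)  # collapses ".." so traversal cannot escape
    base = posixpath.normpath(prefix)
    return norm == base or norm.startswith(base.rstrip("/") + "/")


def is_allowed(policy: Policy, permission: str, target: str) -> bool:
    """Return True only if the policy grants the permission for this target."""
    if permission not in policy.permissions:
        return False
    if permission in (FS_READ, FS_WRITE):
        return any(_within(target, p) for p in policy.allowed_paths)
    if permission == NET_OUT:
        return target in policy.allowed_hosts
    return False  # unknown permissions are denied by default


# Example policy for an imaginary server that reads a workspace and calls one API.
policy = Policy(
    permissions=frozenset({FS_READ, NET_OUT}),
    allowed_paths=("/workspace",),
    allowed_hosts=("api.example.com",),
)

print(is_allowed(policy, FS_READ, "/workspace/notes.txt"))       # True
print(is_allowed(policy, FS_WRITE, "/workspace/notes.txt"))      # False
print(is_allowed(policy, FS_READ, "/workspace/../etc/passwd"))   # False
```

The default-deny fallthrough reflects the general permission-model idea that anything not declared is refused; whether AgentBound makes the same choice is not stated on this page.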
If this is right
- Developers can add security boundaries to existing MCP servers without rewriting or recompiling their code.
- Automatic policy generation from source code lowers the cost of creating and maintaining access rules for large numbers of servers.
- The enforcement engine limits the impact of malicious MCP servers on the host system.
- Negligible runtime overhead means the protection layer can be left on during normal agent operation.
Where Pith is reading between the lines
- The same declarative-policy-plus-enforcement pattern could be applied to other agent-tool connection protocols that currently grant broad host access.
- Higher accuracy in automatic policy generation would reduce the remaining manual review needed for production use.
- Organizations could adopt generated policies as a lightweight vetting step before allowing third-party MCP servers in their environments.
Load-bearing premise
Policies generated automatically from source code at 80.9 percent accuracy are sufficient to block real-world threats while preserving usability, and the enforcement engine reliably contains malicious behavior without false negatives that matter in practice.
What would settle it
A malicious MCP server that still succeeds in performing a restricted harmful action after AgentBound policies are applied, or a large set of legitimate servers that become unusable because the generated policies incorrectly block required operations.
Original abstract
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that the policy enforcement engine introduces negligible overhead. Our contributions provide developers and project managers with a foundation for securing MCP servers while maintaining productivity, enabling researchers and tool builders to explore new directions for declarative access control and MCP security.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AgentBound as the first access control framework for Model Context Protocol (MCP) servers used by AI agents. It combines a declarative policy mechanism (inspired by the Android permission model) with a policy enforcement engine that contains malicious behavior without requiring modifications to the MCP servers. The authors collect a dataset of the 296 most popular MCP servers, demonstrate automatic policy generation from source code achieving 80.9% accuracy, show that the system blocks the majority of security threats in several malicious MCP servers, and report negligible overhead from the enforcement engine.
Significance. If the empirical results hold under a clearly specified threat model and error analysis, AgentBound would provide a practical, low-friction approach to securing the growing ecosystem of MCP servers. The combination of automatic policy generation from source code with an unmodified-server enforcement engine addresses a real deployment constraint and could serve as a foundation for declarative access control in agent-tool interactions. The work ships a concrete dataset and measurements on real-world servers, which strengthens its utility for follow-on research.
Major comments (2)
- [Evaluation section] Policy generation results: The reported 80.9% accuracy on the 296-server dataset is presented without specifying the metric (micro/macro accuracy, precision, recall, or F1), without a per-permission-type breakdown (e.g., file-system write, network outbound, process spawn), and without false-negative rates on security-critical permissions. Because the central claim is that automatically generated policies plus the enforcement engine contain malicious behavior, the absence of this analysis leaves open the possibility that errors concentrate exactly on high-risk actions, undermining the 'blocks the majority of security threats' result.
- [Evaluation section] Threat-model and malicious-server experiments: The paper states that AgentBound 'blocks the majority of security threats in several malicious MCP servers' but provides no explicit description of the threat model, the specific malicious behaviors tested, the success criteria for blocking, or the number and diversity of malicious servers used. This information is load-bearing for validating that the enforcement engine reliably contains threats across the claimed diverse set of MCP servers.
Minor comments (2)
- [Abstract] The abstract and introduction should clarify whether the 80.9% figure is an aggregate across all permissions or weighted by risk level.
- [System Design] Notation for policy elements (e.g., how declarative policies map to enforcement checks) could be made more explicit with a small example table.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We agree that additional specificity in the evaluation metrics and threat model description will strengthen the paper and address potential concerns about the reliability of our results. We will revise the manuscript to incorporate these clarifications.
- Referee: [Evaluation section] Policy generation results: The reported 80.9% accuracy on the 296-server dataset is presented without specifying the metric (micro/macro accuracy, precision, recall, or F1), without a per-permission-type breakdown (e.g., file-system write, network outbound, process spawn), and without false-negative rates on security-critical permissions. Because the central claim is that automatically generated policies plus the enforcement engine contain malicious behavior, the absence of this analysis leaves open the possibility that errors concentrate exactly on high-risk actions, undermining the 'blocks the majority of security threats' result.
  Authors: We agree that the current presentation of the 80.9% accuracy figure lacks necessary detail. In the revised manuscript we will explicitly define the metric as overall accuracy (fraction of correctly classified permissions across the entire 296-server dataset). We will add a per-permission-type breakdown with precision, recall, and F1 scores for categories including file-system write, network outbound, and process spawn. We will also report false-negative rates specifically for security-critical permissions. These additions will allow readers to verify that misclassifications do not concentrate on high-risk actions and will directly support the claim that the generated policies plus enforcement engine contain malicious behavior.
  Revision: yes
- Referee: [Evaluation section] Threat-model and malicious-server experiments: The paper states that AgentBound 'blocks the majority of security threats in several malicious MCP servers' but provides no explicit description of the threat model, the specific malicious behaviors tested, the success criteria for blocking, or the number and diversity of malicious servers used. This information is load-bearing for validating that the enforcement engine reliably contains threats across the claimed diverse set of MCP servers.
  Authors: We acknowledge that the malicious-server experiments require a clearer threat model and experimental details. In the revision we will add an explicit threat-model subsection describing the assumed attacker (malicious MCP servers attempting unauthorized host access) and the specific behaviors tested (e.g., unauthorized file writes, outbound network connections, and process spawning). We will define success criteria (enforcement engine blocks the action without server modification or crash) and report the exact number and diversity of malicious servers evaluated. These changes will substantiate the statement that AgentBound blocks the majority of security threats.
  Revision: yes
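The gap between a single overall accuracy figure and a per-permission breakdown can be illustrated with a short Python sketch. It computes overall accuracy (the fraction of correctly classified permission decisions) alongside per-permission precision, recall, and error rates. The record layout, permission names, and toy labels are invented for illustration and are not results from the paper. The sketch assumes a positive label means "the server needs this permission", so a false positive over-grants access and a false negative withholds a needed permission; the paper may define these terms differently.

```python
from collections import defaultdict


def evaluate(records):
    """records: iterable of (permission, truly_required, predicted_required)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for perm, truth, pred in records:
        if truth and pred:
            key = "tp"
        elif truth and not pred:
            key = "fn"
        elif pred:
            key = "fp"
        else:
            key = "tn"
        counts[perm][key] += 1

    # Overall accuracy pools every decision, regardless of permission type.
    total = sum(sum(c.values()) for c in counts.values())
    correct = sum(c["tp"] + c["tn"] for c in counts.values())
    overall = correct / total if total else 0.0

    per_perm = {}
    for perm, c in counts.items():
        tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
        per_perm[perm] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "false_negative_rate": fn / (tp + fn) if tp + fn else 0.0,
            "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        }
    return overall, per_perm


# Toy data: all errors fall on one permission type.
records = [
    ("filesystem.read", True, True),
    ("filesystem.read", True, True),
    ("filesystem.read", False, False),
    ("filesystem.read", False, False),
    ("process.spawn", True, False),
    ("process.spawn", True, True),
    ("process.spawn", False, False),
    ("process.spawn", False, False),
]

overall, per_perm = evaluate(records)
print(overall)                              # 0.875
print(per_perm["process.spawn"]["recall"])  # 0.5
```

In this toy case the pooled accuracy looks healthy while recall on one permission type is only one half, which is the kind of concentration the referee asks the authors to rule out.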
Circularity Check
No circularity: central claims rest on new system design and independent empirical measurements
The paper introduces AgentBound as a new access-control framework, constructs a fresh dataset of 296 MCP servers, and reports measured outcomes (80.9% policy-generation accuracy, threat blocking on malicious servers, negligible overhead). None of these steps reduce by construction to prior fitted parameters, self-citations, or redefinitions; the accuracy figure and blocking results are direct observations on collected data rather than tautological outputs of the same inputs. Reliance on the Android permission model is an external inspiration, not a load-bearing self-citation chain. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: Source code of MCP servers contains sufficient static information to infer required permissions with usable accuracy
- Domain assumption: The policy enforcement engine can intercept and block all relevant host operations without modifying the server binary
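The first assumption can be probed with a deliberately naive Python sketch that infers coarse permissions from the imports in a server's source. This is not the paper's generation method, which this page does not describe; the module-to-permission table and the `infer_permissions` function are hypothetical. The second call shows a case where static information is insufficient, which is why the assumption is listed as an axiom rather than a given.

```python
import ast

# Hypothetical mapping from imported top-level modules to coarse permissions.
MODULE_PERMISSIONS = {
    "socket": "network.outbound",
    "urllib": "network.outbound",
    "http": "network.outbound",
    "subprocess": "process.spawn",
    "shutil": "filesystem.write",
}


def infer_permissions(source: str) -> set:
    """Collect permissions implied by static import statements in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module] if node.module else []  # skip "from . import x"
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in MODULE_PERMISSIONS:
                found.add(MODULE_PERMISSIONS[root])
    return found


static_source = "import subprocess\nfrom urllib.request import urlopen\nimport json\n"
dynamic_source = 'mod = __import__("socket")\n'

print(sorted(infer_permissions(static_source)))  # ['network.outbound', 'process.spawn']
print(infer_permissions(dynamic_source))         # set(): the dynamic import is missed
```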
Invented entities (1)
- AgentBound policy enforcement engine: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications.
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, absolute_floor_iff_bare_distinguishability
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 5 Pith papers
- Do Coding Agents Understand Least-Privilege Authorization?
  Coding agents struggle to infer least-privilege file permissions by omitting needed accesses while granting unused or sensitive ones, but Sufficiency-Tightness Decomposition improves sensitive-task success by up to 15...
- MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security
  MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.
- From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers
  Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.
- Exploiting LLM Agent Supply Chains via Payload-less Skills
  Semantic Compliance Hijacking lets attackers hijack LLM agents by disguising malicious instructions as compliance rules in skills, reaching up to 77.67% success on confidentiality breaches and 67.33% on RCE while evad...
- Tracking Capabilities for Safer Agents
  AI agents can generate code in a capability-safe Scala dialect that statically prevents information leakage and malicious side effects while preserving task performance.