CapSeal: Capability-Sealed Secret Mediation for Secure Agent Execution
Pith reviewed 2026-05-10 07:42 UTC · model grok-4.3
The pith
CapSeal replaces direct secret exposure in AI agents with a local broker that grants narrowly scoped, non-exportable action capabilities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CapSeal is a capability-sealed secret mediation architecture that replaces direct secret access with constrained invocations through a local trusted broker. It combines capability issuance, schema-constrained HTTP execution, broker-executed SSH actions, anti-replay session binding, policy evaluation, and tamper-evident audit trails. The system reframes secret handling for agentic systems: instead of handing the model a key, it grants the model a narrowly scoped, non-exportable action capability.
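The mediation pattern in the core claim can be sketched in a few lines. The paper's prototype is in Rust; the Python below is an illustrative stand-in, and every name (`CapabilityBroker`, `issue_capability`, `invoke`, the constraint fields) is invented here rather than taken from CapSeal.

```python
import secrets
import time

class CapabilityBroker:
    """Illustrative broker: it stores raw secrets and hands the agent only an
    opaque capability ID scoped to a host, a method set, and a lifetime."""

    def __init__(self):
        self._secrets = {}        # secret name -> raw credential, never returned
        self._capabilities = {}   # capability ID -> constraints

    def store_secret(self, name, value):
        self._secrets[name] = value

    def issue_capability(self, secret_name, allowed_host, allowed_methods, ttl_s=60):
        cap_id = secrets.token_hex(16)  # opaque, unguessable handle
        self._capabilities[cap_id] = {
            "secret": secret_name,
            "host": allowed_host,
            "methods": set(allowed_methods),
            "expires": time.monotonic() + ttl_s,
        }
        return cap_id  # the agent receives only this handle

    def invoke(self, cap_id, method, host, path):
        cap = self._capabilities.get(cap_id)
        if cap is None or time.monotonic() > cap["expires"]:
            return {"ok": False, "error": "unknown or expired capability"}
        if host != cap["host"] or method not in cap["methods"]:
            return {"ok": False, "error": "request outside capability scope"}
        # The broker attaches the credential to the outbound request itself;
        # only a redacted summary ever flows back to the agent process.
        authorization = f"Bearer {self._secrets[cap['secret']]}"
        assert authorization  # a real broker would perform the HTTP call here
        return {"ok": True, "sent": f"{method} https://{host}{path}"}
```

An agent holding the capability ID can trigger `GET` calls against the granted host but cannot read the underlying token, and a `POST` outside the granted method set is refused at the broker boundary.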
What carries the argument
The local trusted broker that issues and enforces non-exportable capabilities for secret-using actions while keeping the raw secrets hidden from the agent process.
If this is right
- Agents can invoke API endpoints using secrets without the model ever receiving the credential string.
- SSH commands can be executed through the broker under policy control rather than direct agent access to keys.
- Replay attacks on issued capabilities are blocked by session binding and anti-replay checks.
- All mediated actions produce tamper-evident audit records that survive later inspection.
- Security properties of non-disclosure, constrained use, replay resistance, and auditability hold under the stated threat model.
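The replay-resistance bullet can be made concrete with a toy check: each invocation carries a fresh nonce bound to the issuing session, and the broker rejects both nonce reuse and use of the capability in a different session. This is an illustrative sketch, not CapSeal's actual wire format.

```python
import secrets

class ReplayGuard:
    """Toy anti-replay check: a capability is bound to the session it was
    issued under, and each invocation nonce is accepted at most once."""

    def __init__(self):
        self._bindings = {}  # capability ID -> session ID it was issued under
        self._seen = set()   # (capability ID, nonce) pairs already accepted

    def bind(self, cap_id, session_id):
        self._bindings[cap_id] = session_id

    def check(self, cap_id, session_id, nonce):
        if self._bindings.get(cap_id) != session_id:
            return "reject: wrong session"  # capability replayed in another session
        if (cap_id, nonce) in self._seen:
            return "reject: replay"         # same request re-sent verbatim
        self._seen.add((cap_id, nonce))
        return "accept"

def fresh_nonce():
    return secrets.token_hex(8)  # caller mints a new nonce per invocation
```

A captured (capability, nonce) pair is worthless to an attacker: re-sending it trips the replay check, and presenting it under a different session trips the binding check.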
Where Pith is reading between the lines
- The broker model could extend to database credentials or OAuth tokens by defining additional action schemas.
- Integration with common agent runtimes would require only an adapter layer rather than changes to the model itself.
- Policy language expressiveness would determine how finely actions can be scoped in practice.
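The last point, policy expressiveness, is what determines scoping granularity in practice. The contrast between coarse and fine scoping can be sketched with glob-matched rules; the rule syntax below is entirely invented for illustration, not CapSeal's policy language.

```python
import fnmatch

def evaluate(policy, action):
    """Return True if any rule permits the action. A rule always constrains
    the host; an optional 'command' glob narrows it further."""
    for rule in policy:
        if not fnmatch.fnmatch(action["host"], rule["host"]):
            continue
        if "command" in rule and not fnmatch.fnmatch(action["command"], rule["command"]):
            continue
        return True
    return False

coarse = [{"host": "build-*.internal"}]                     # any command on build hosts
fine = [{"host": "build-*.internal", "command": "make *"}]  # only make invocations
```

Under the coarse policy an injected `cat /etc/shadow` on a build host sails through; the finer policy stops it while still permitting the intended `make` invocations.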
Load-bearing premise
The local trusted broker remains uncompromised and correctly implements capability issuance, policy checks, and anti-replay mechanisms without creating new attack surfaces.
What would settle it
A test in which an agent process under prompt injection or tool misuse succeeds in extracting a raw secret value or executing an unauthorized action despite the broker's mediation.
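That experiment could be operationalized as an exfiltration oracle: drive the mediation layer with adversarial inputs and assert the raw credential never appears in anything returned to the agent. The harness below is schematic; the broker stand-in, secret value, and host names are all hypothetical.

```python
import json

SECRET = "sk-live-THE-RAW-KEY"        # hypothetical credential under test
ALLOWED_HOST = "api.example.com"      # the only host the capability permits

def broker_invoke(host, path):
    """Stand-in for a mediated invocation: hosts outside the capability scope
    are denied, and the credential is used internally but never echoed back."""
    if host != ALLOWED_HOST:
        return {"status": "denied", "reason": "host outside capability scope"}
    outbound_auth = f"Bearer {SECRET}"  # attached to the outbound call only
    assert outbound_auth
    return {"status": "executed"}

def leaks_secret(response):
    """Exfiltration oracle: does the raw secret appear anywhere in what the
    agent gets back? A single True here would settle the question."""
    return SECRET in json.dumps(response)

# Prompt-injection-shaped inputs are folded into the request; the oracle
# checks the response, not the model's intent.
attacks = ["/exfil?echo=Authorization", "/..%2fprint-key"]
```

A real evaluation would replace the stand-in with the actual broker and feed it model-generated tool calls under injection; the pass/fail criterion stays the same oracle.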
Original abstract
Modern AI agents routinely depend on secrets such as API keys and SSH credentials, yet the dominant deployment model still exposes those secrets directly to the agent process through environment variables, local files, or forwarding sockets. This design fails against prompt injection, tool misuse, and model-controlled exfiltration because the agent can both use and reveal the same bearer credential. We present CapSeal, a capability-sealed secret mediation architecture that replaces direct secret access with constrained invocations through a local trusted broker. CapSeal combines capability issuance, schema-constrained HTTP execution, broker-executed SSH actions, anti-replay session binding, policy evaluation, and tamper-evident audit trails. We describe a Rust prototype integrated with an MCP-facing adapter, formulate conditional security goals for non-disclosure, constrained use, replay resistance, and auditability, and define an evaluation plan spanning prompt injection, tool misuse, and SSH abuse. The resulting system reframes secret handling for agentic systems from handing the model a key to granting the model a narrowly scoped, non-exportable action capability.
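One component named in the abstract, tamper-evident audit trails, is commonly built as a hash chain in which each record commits to its predecessor. The sketch below shows that standard construction; it is not CapSeal's actual log format.

```python
import hashlib
import json

class AuditLog:
    """Hash-chained audit log: each record's digest covers the previous
    digest, so in-place edits or reordering break verification."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []  # list of (serialized record, hex digest)
        self._head = self.GENESIS

    def append(self, record):
        blob = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((self._head + blob).encode()).hexdigest()
        self.entries.append((blob, digest))
        self._head = digest

    def verify(self):
        head = self.GENESIS
        for blob, digest in self.entries:
            if hashlib.sha256((head + blob).encode()).hexdigest() != digest:
                return False  # chain broken: a record was altered
            head = digest
        return True
```

Surviving "later inspection" then reduces to re-walking the chain: any post-hoc edit to a mediated-action record invalidates every subsequent digest.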
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes CapSeal, a capability-sealed secret mediation architecture for AI agents that replaces direct exposure of bearer secrets (API keys, SSH credentials) with narrowly scoped, non-exportable action capabilities mediated by a local trusted broker. The design integrates capability issuance, schema-constrained HTTP/SSH execution, anti-replay session binding, policy evaluation, and tamper-evident logging. The manuscript describes a Rust prototype with an MCP-facing adapter, formulates conditional security goals for non-disclosure, constrained use, replay resistance, and auditability, and outlines an evaluation plan covering prompt injection, tool misuse, and SSH abuse.
Significance. If the broker assumptions hold and the mechanisms are shown to enforce the claimed properties, the reframing from bearer-secret sharing to scoped capability granting could meaningfully reduce exfiltration risks in agentic systems. The concrete Rust prototype and explicit evaluation plan are positive elements that ground the design in implementable terms.
major comments (3)
- [Abstract / Security Goals] The central claim that CapSeal achieves non-disclosure and non-exportability rests on the broker correctly performing issuance, schema checks, anti-replay binding, and policy enforcement, yet the manuscript supplies only conditional goal statements with no formal model, machine-checked proofs, or threat-model validation.
- [Evaluation Plan] The plan covers prompt injection, tool misuse, and SSH abuse, but no experimental results, attack simulations, or measurements are reported, leaving the assertion that capabilities remain non-exportable an untested design property rather than a demonstrated one.
- [Prototype Implementation] The trusted broker is described as local and uncompromised, but there is no analysis of how its implementation avoids introducing new attack surfaces or how its correctness was established, which is load-bearing for all security claims.
minor comments (2)
- [Design] Figure 1 (architecture diagram) would benefit from explicit labeling of the capability token format and the exact points where schema validation and anti-replay checks occur.
- [Related Work] The manuscript would be strengthened by adding citations to prior capability-based systems (e.g., object-capability literature) and recent agent-security papers on prompt injection defenses.
Simulated Author's Rebuttal
We thank the referee for the thoughtful review and the recommendation for major revision. We address each of the major comments below, clarifying the scope of our work as an architectural proposal and outlining planned revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract / Security Goals] The central claim that CapSeal achieves non-disclosure and non-exportability rests on the broker correctly performing issuance, schema checks, anti-replay binding, and policy enforcement, yet the manuscript supplies only conditional goal statements with no formal model, machine-checked proofs, or threat-model validation.
  Authors: We acknowledge that the security properties are presented conditionally on the trusted broker. This manuscript proposes an architecture and prototype rather than a formally verified system. In the revision, we will expand the Security Goals section with a more detailed informal threat model that states the broker assumptions and the conditional nature of the claims explicitly. Machine-checked proofs are outside the current scope. (revision: partial)
- Referee: [Evaluation Plan] The plan covers prompt injection, tool misuse, and SSH abuse, but no experimental results, attack simulations, or measurements are reported, leaving the assertion that capabilities remain non-exportable an untested design property rather than a demonstrated one.
  Authors: The manuscript describes the design, a Rust prototype, and an evaluation plan, but does not include results; the focus is on the proposed system. We will revise the Evaluation Plan section to state clearly that the evaluations are planned future work and add a note on the current status of the prototype. Any basic tests performed during development will be summarized where appropriate. (revision: partial)
- Referee: [Prototype Implementation] The trusted broker is described as local and uncompromised, but there is no analysis of how its implementation avoids introducing new attack surfaces or how its correctness was established, which is load-bearing for all security claims.
  Authors: We agree that additional analysis is needed. The revised version will discuss potential attack surfaces in the broker implementation, such as policy misconfiguration and code vulnerabilities, and how the design and Rust's safety guarantees help mitigate them. We will also state that the prototype's correctness currently rests on implementation testing rather than formal methods. (revision: yes)
Circularity Check
No circularity: system design with no derivations, equations, or self-referential predictions
Full rationale
The paper describes an architectural system (CapSeal) for replacing bearer secrets with scoped capabilities via a trusted broker. It lists components (capability issuance, schema-constrained HTTP/SSH execution, anti-replay binding, policy evaluation, audit trails) and states conditional security goals plus an evaluation plan, but supplies no equations, fitted parameters, first-principles derivations, or predictions. No load-bearing steps reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains. The contribution is a design reframing whose correctness claims are independent of any internal mathematical reduction and rest on external assumptions about the broker.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: a local trusted broker can be maintained outside the agent's control and will correctly enforce issued capabilities.
invented entities (1)
- Capability-sealed secret mediation broker (no independent evidence)
Forward citations
Cited by 1 Pith paper
- The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck — PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in Age…
Reference graph
Works this paper leans on
- [1] K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, "Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection," in Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (CCS '23), ACM, Nov. 2023, pp. 79–90.
- [2] D. Lee and M. Tiwari, "Prompt infection: LLM-to-LLM prompt injection within multi-agent systems," 2024.
- [3] Z. Chen, B. Li, D. Song, Z. Xiang, and C. Xiao, "AgentPoison: Red-teaming LLM agents via poisoning memory or knowledge bases," in Advances in Neural Information Processing Systems 37 (NeurIPS 2024), pp. 130185–130213.
- [4] Y. Wang, D. Xue, S. Zhang, and S. Qian, "BadAgent: Inserting and activating backdoor attacks in LLM agents," in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 9811–9827.
- [5] X. Hou, Y. Zhao, S. Wang, and H. Wang, "Model context protocol (MCP): Landscape, security threats, and future research directions," ACM Transactions on Software Engineering and Methodology, Feb. 2026.
- [6] Z. Wang, Y. Gao, Y. Wang, S. Liu, H. Sun, H. Cheng, G. Shi, H. Du, and X. Li, "MCPTox: A benchmark for tool poisoning on real-world MCP servers," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, pp. 35811–35819, Mar. 2026.
- [7] L. Williams, G. Benedetti, S. Hamer, R. Paramitha, I. Rahman, M. Tamanna, G. Tystahl, N. Zahan, P. Morrison, Y. Acar, M. Cukier, C. Kästner, A. Kapravelos, D. Wermke, and W. Enck, "Research directions in software supply chain security," ACM Transactions on Software Engineering and Methodology, vol. 34, no. 5, pp. 1–38, May 2025.
- [8] P. Przymus and T. Durieux, "Wolves in the repository: A software engineering analysis of the xz utils supply chain attack," in 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR), IEEE, Apr. 2025, pp. 91–102.
- [9] S. A. Crosby and D. S. Wallach, "Efficient data structures for tamper-evident logging," in USENIX Security Symposium, 2009.
- [10] IETF, "The OAuth 2.0 authorization framework: Bearer token usage," RFC 6750, 2012. Available: https://www.rfc-editor.org/rfc/rfc6750
- [11] Y. Li, H. Wen, W. Wang, X. Li, Y. Yuan, G. Liu, J. Chen, W. Yao, X. Fu, M. Liu et al., "Personal LLM agents: Insights and survey about the capability, efficiency and security," 2024.
- [12] Anthropic, "Model context protocol specification," 2025. Available: https://modelcontextprotocol.io/specification
- [13] J. Shi, Z. Yuan, G. Tie, P. Zhou, N. Z. Gong, and L. Sun, "Prompt injection attack to tool selection in LLM agents," arXiv preprint arXiv:2504.19793, 2025.
- [14] H. An, J. Zhang, T. Du, C. Yu, W. Wang, Y. Li, H. Zhang, J. Zhou, J. Huang, and Y. Zhuge, "IPIGuard: A novel tool dependency graph-based defense against indirect prompt injection in LLM agents," in Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025, arXiv:2508.15310.
- [15] F. Jia, T. Wu, X. Qin, G. Liu, S. Yang, M. Zhao, Y. Liu, S. Ding, X. Li, J. Huang, X. Liu, and L. Sun, "The task shield: Enforcing task alignment to defend against indirect prompt injection in LLM agents," in Annual Meeting of the Association for Computational Linguistics (ACL), 2024, arXiv:2412.16682.
- [16] K. Zhu, X. Yang, J. Wang, X. Yan, H. Qi, Y. Chen, L. Ye, Y. Xie, Y. Mao, Y. Wang, B. Zhou, Y. Chen, J. Leskovec, X. Xie, Y. Zhang, and W. Zhou, "MELON: Provable defense against indirect prompt injection attacks in AI agents," in International Conference on Machine Learning (ICML), 2025.
- [17] HashiCorp, "Vault transit secrets engine," 2025. Available: https://developer.hashicorp.com/vault/docs/secrets/transit
- [18] Amazon Web Services, "AWS Key Management Service developer guide," 2025. Available: https://docs.aws.amazon.com/kms/
- [19] IETF, "Certificate transparency version 2.0," RFC 9162, 2021. Available: https://www.rfc-editor.org/rfc/rfc9162
- [20] "JSON type definition," RFC 8927, 2020. Available: https://www.rfc-editor.org/rfc/rfc8927
- [21] "ssh_config," 2025. Available: https://man.openbsd.org/ssh_config
- [22] "ssh-add," 2025. Available: https://man.openbsd.org/ssh-add
- [23] E. Debenedetti, I. Shumailov, T. Fan, J. Hayes, N. Carlini, D. Fabian, C. Kern, C. Shi, A. Terzis, and F. Tramèr, "Defeating prompt injections by design," 2025. Available: https://arxiv.org/abs/2503.18813
- [24] S. Liu, C. Li, C. Wang, J. Hou, Z. Chen, L. Zhang, Z. Liu, Q. Ye, Y. Hei, X. Zhang, and Z. Wang, "Clawkeeper: Comprehensive safety protection for openclaw agents through skills, plugins, and watchers." Available: https://arxiv.org/abs/2603.24414
- [26] M. S. Miller, K.-P. Yee, and J. S. Shapiro, "Capability myths demolished," Technical Report, 2003. Available: http://www.erights.org/talks/myths/
- [27] A. Birgisson, J. G. Politz, U. Erlingsson, A. Taly, M. Vrable, and M. Lentczner, "Macaroons: Cookies with contextual caveats for decentralized authorization in the cloud," in Proceedings of the Network and Distributed System Security Symposium (NDSS), 2014.
- [28] Y. Chen, Y. Zhang, and X. Lin, "ZKP-CapBAC: Capability-based access control via on-chain zero-knowledge proofs for cross-domain hiding delegation tree," in IEEE International Conference on Computer Communications (INFOCOM), 2025.
- [29] D. Devriese, L. Birkedal, and F. Piessens, "Reasoning about object capabilities with logical relations and effect parametricity," in IEEE European Symposium on Security and Privacy (EuroS&P), 2016.
- [30] J. Zhang, "Right to history: A sovereignty kernel for verifiable AI agent execution," 2026.
- [31] "The Transport Layer Security (TLS) protocol version 1.3," RFC 8446, 2018. Available: https://www.rfc-editor.org/rfc/rfc8446
- [33] "OAuth 2.0 mutual-TLS client authentication and certificate-bound access tokens," RFC 8705, 2020. Available: https://www.rfc-editor.org/rfc/rfc8705
- [34] "Channel bindings for TLS 1.3," RFC 9266, 2022. Available: https://www.rfc-editor.org/rfc/rfc9266
- [35] A. Venčkauskas, D. Kukta, and Š. Grigaliūnas, "Enhancing microservices security with token-based access control method," Sensors, vol. 23, no. 6, p. 3363, 2023.
- [36] OWASP, "OWASP Top 10 for LLM applications," 2025. Available: https://genai.owasp.org/