CapSeal: Capability-Sealed Secret Mediation for Secure Agent Execution
Pith reviewed 2026-05-10 07:42 UTC · model grok-4.3
The pith
CapSeal replaces direct secret exposure in AI agents with a local broker that grants narrowly scoped, non-exportable action capabilities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CapSeal is a capability-sealed secret mediation architecture that replaces direct secret access with constrained invocations through a local trusted broker. It combines capability issuance, schema-constrained HTTP execution, broker-executed SSH actions, anti-replay session binding, policy evaluation, and tamper-evident audit trails. The system reframes secret handling for agentic systems: instead of handing the model a key, it grants the model a narrowly scoped, non-exportable action capability.
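The mediation pattern in the core claim can be sketched in a few lines. The paper's prototype is in Rust; the Python below is an illustrative stand-in, and every name (`CapabilityBroker`, `issue_capability`, `invoke`, the constraint fields) is invented here rather than taken from CapSeal.

```python
import secrets
import time

class CapabilityBroker:
    """Illustrative broker: it stores raw secrets and hands the agent only an
    opaque capability ID scoped to a host, a method set, and a lifetime."""

    def __init__(self):
        self._secrets = {}        # secret name -> raw credential, never returned
        self._capabilities = {}   # capability ID -> constraints

    def store_secret(self, name, value):
        self._secrets[name] = value

    def issue_capability(self, secret_name, allowed_host, allowed_methods, ttl_s=60):
        cap_id = secrets.token_hex(16)  # opaque, unguessable handle
        self._capabilities[cap_id] = {
            "secret": secret_name,
            "host": allowed_host,
            "methods": set(allowed_methods),
            "expires": time.monotonic() + ttl_s,
        }
        return cap_id  # the agent receives only this handle

    def invoke(self, cap_id, method, host, path):
        cap = self._capabilities.get(cap_id)
        if cap is None or time.monotonic() > cap["expires"]:
            return {"ok": False, "error": "unknown or expired capability"}
        if host != cap["host"] or method not in cap["methods"]:
            return {"ok": False, "error": "request outside capability scope"}
        # The broker attaches the credential to the outbound request itself;
        # only a redacted summary ever flows back to the agent process.
        authorization = f"Bearer {self._secrets[cap['secret']]}"
        assert authorization  # a real broker would perform the HTTP call here
        return {"ok": True, "sent": f"{method} https://{host}{path}"}
```

An agent holding the capability ID can trigger `GET` calls against the granted host but cannot read the underlying token, and a `POST` outside the granted method set is refused at the broker boundary.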
What carries the argument
The local trusted broker that issues and enforces non-exportable capabilities for secret-using actions while keeping the raw secrets hidden from the agent process.
If this is right
- Agents can invoke API endpoints using secrets without the model ever receiving the credential string.
- SSH commands can be executed through the broker under policy control rather than direct agent access to keys.
- Replay attacks on issued capabilities are blocked by session binding and anti-replay checks.
- All mediated actions produce tamper-evident audit records that survive later inspection.
- Security properties of non-disclosure, constrained use, replay resistance, and auditability hold under the stated threat model.
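The replay-resistance bullet can be made concrete with a toy check: each invocation carries a fresh nonce bound to the issuing session, and the broker rejects both nonce reuse and use of the capability in a different session. This is an illustrative sketch, not CapSeal's actual wire format.

```python
import secrets

class ReplayGuard:
    """Toy anti-replay check: a capability is bound to the session it was
    issued under, and each invocation nonce is accepted at most once."""

    def __init__(self):
        self._bindings = {}  # capability ID -> session ID it was issued under
        self._seen = set()   # (capability ID, nonce) pairs already accepted

    def bind(self, cap_id, session_id):
        self._bindings[cap_id] = session_id

    def check(self, cap_id, session_id, nonce):
        if self._bindings.get(cap_id) != session_id:
            return "reject: wrong session"  # capability replayed in another session
        if (cap_id, nonce) in self._seen:
            return "reject: replay"         # same request re-sent verbatim
        self._seen.add((cap_id, nonce))
        return "accept"

def fresh_nonce():
    return secrets.token_hex(8)  # caller mints a new nonce per invocation
```

A captured (capability, nonce) pair is worthless to an attacker: re-sending it trips the replay check, and presenting it under a different session trips the binding check.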
Where Pith is reading between the lines
- The broker model could extend to database credentials or OAuth tokens by defining additional action schemas.
- Integration with common agent runtimes would require only an adapter layer rather than changes to the model itself.
- Policy language expressiveness would determine how finely actions can be scoped in practice.
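The last point, policy expressiveness, is what determines scoping granularity in practice. The contrast between coarse and fine scoping can be sketched with glob-matched rules; the rule syntax below is entirely invented for illustration, not CapSeal's policy language.

```python
import fnmatch

def evaluate(policy, action):
    """Return True if any rule permits the action. A rule always constrains
    the host; an optional 'command' glob narrows it further."""
    for rule in policy:
        if not fnmatch.fnmatch(action["host"], rule["host"]):
            continue
        if "command" in rule and not fnmatch.fnmatch(action["command"], rule["command"]):
            continue
        return True
    return False

coarse = [{"host": "build-*.internal"}]                     # any command on build hosts
fine = [{"host": "build-*.internal", "command": "make *"}]  # only make invocations
```

Under the coarse policy an injected `cat /etc/shadow` on a build host sails through; the finer policy stops it while still permitting the intended `make` invocations.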
Load-bearing premise
The local trusted broker remains uncompromised and correctly implements capability issuance, policy checks, and anti-replay mechanisms without creating new attack surfaces.
What would settle it
A test in which an agent process under prompt injection or tool misuse succeeds in extracting a raw secret value or executing an unauthorized action despite the broker's mediation.
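That experiment could be operationalized as an exfiltration oracle: drive the mediation layer with adversarial inputs and assert the raw credential never appears in anything returned to the agent. The harness below is schematic; the broker stand-in, secret value, and host names are all hypothetical.

```python
import json

SECRET = "sk-live-THE-RAW-KEY"        # hypothetical credential under test
ALLOWED_HOST = "api.example.com"      # the only host the capability permits

def broker_invoke(host, path):
    """Stand-in for a mediated invocation: hosts outside the capability scope
    are denied, and the credential is used internally but never echoed back."""
    if host != ALLOWED_HOST:
        return {"status": "denied", "reason": "host outside capability scope"}
    outbound_auth = f"Bearer {SECRET}"  # attached to the outbound call only
    assert outbound_auth
    return {"status": "executed"}

def leaks_secret(response):
    """Exfiltration oracle: does the raw secret appear anywhere in what the
    agent gets back? A single True here would settle the question."""
    return SECRET in json.dumps(response)

# Prompt-injection-shaped inputs are folded into the request; the oracle
# checks the response, not the model's intent.
attacks = ["/exfil?echo=Authorization", "/..%2fprint-key"]
```

A real evaluation would replace the stand-in with the actual broker and feed it model-generated tool calls under injection; the pass/fail criterion stays the same oracle.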
Original abstract
Modern AI agents routinely depend on secrets such as API keys and SSH credentials, yet the dominant deployment model still exposes those secrets directly to the agent process through environment variables, local files, or forwarding sockets. This design fails against prompt injection, tool misuse, and model-controlled exfiltration because the agent can both use and reveal the same bearer credential. We present CapSeal, a capability-sealed secret mediation architecture that replaces direct secret access with constrained invocations through a local trusted broker. CapSeal combines capability issuance, schema-constrained HTTP execution, broker-executed SSH actions, anti-replay session binding, policy evaluation, and tamper-evident audit trails. We describe a Rust prototype integrated with an MCP-facing adapter, formulate conditional security goals for non-disclosure, constrained use, replay resistance, and auditability, and define an evaluation plan spanning prompt injection, tool misuse, and SSH abuse. The resulting system reframes secret handling for agentic systems from handing the model a key to granting the model a narrowly scoped, non-exportable action capability.
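One component named in the abstract, tamper-evident audit trails, is commonly built as a hash chain in which each record commits to its predecessor. The sketch below shows that standard construction; it is not CapSeal's actual log format.

```python
import hashlib
import json

class AuditLog:
    """Hash-chained audit log: each record's digest covers the previous
    digest, so in-place edits or reordering break verification."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []  # list of (serialized record, hex digest)
        self._head = self.GENESIS

    def append(self, record):
        blob = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((self._head + blob).encode()).hexdigest()
        self.entries.append((blob, digest))
        self._head = digest

    def verify(self):
        head = self.GENESIS
        for blob, digest in self.entries:
            if hashlib.sha256((head + blob).encode()).hexdigest() != digest:
                return False  # chain broken: a record was altered
            head = digest
        return True
```

Surviving "later inspection" then reduces to re-walking the chain: any post-hoc edit to a mediated-action record invalidates every subsequent digest.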
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes CapSeal, a capability-sealed secret mediation architecture for AI agents that replaces direct exposure of bearer secrets (API keys, SSH credentials) with narrowly scoped, non-exportable action capabilities mediated by a local trusted broker. The design integrates capability issuance, schema-constrained HTTP/SSH execution, anti-replay session binding, policy evaluation, and tamper-evident logging. The manuscript describes a Rust prototype with an MCP-facing adapter, formulates conditional security goals for non-disclosure, constrained use, replay resistance, and auditability, and outlines an evaluation plan covering prompt injection, tool misuse, and SSH abuse.
Significance. If the broker assumptions hold and the mechanisms are shown to enforce the claimed properties, the reframing from bearer-secret sharing to scoped capability granting could meaningfully reduce exfiltration risks in agentic systems. The concrete Rust prototype and explicit evaluation plan are positive elements that ground the design in implementable terms.
major comments (3)
- [Abstract / Security Goals] The central claim that CapSeal achieves non-disclosure and non-exportability rests on the broker correctly performing issuance, schema checks, anti-replay binding, and policy enforcement, yet the manuscript supplies only conditional goal statements with no formal model, machine-checked proofs, or threat-model validation.
- [Evaluation Plan] The plan covers prompt injection, tool misuse, and SSH abuse, but no experimental results, attack simulations, or measurements are reported, leaving the assertion that capabilities remain non-exportable an untested design property rather than a demonstrated one.
- [Prototype Implementation] The trusted broker is described as local and uncompromised, but there is no analysis of how its implementation avoids introducing new attack surfaces or how its correctness was established, which is load-bearing for all security claims.
minor comments (2)
- [Design] Figure 1 (architecture diagram) would benefit from explicit labeling of the capability token format and the exact points where schema validation and anti-replay checks occur.
- [Related Work] The manuscript would be strengthened by adding citations to prior capability-based systems (e.g., object-capability literature) and recent agent-security papers on prompt injection defenses.
Simulated Author's Rebuttal
We thank the referee for the thoughtful review and the recommendation for major revision. We address each of the major comments below, clarifying the scope of our work as an architectural proposal and outlining planned revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract / Security Goals] The central claim that CapSeal achieves non-disclosure and non-exportability rests on the broker correctly performing issuance, schema checks, anti-replay binding, and policy enforcement, yet the manuscript supplies only conditional goal statements with no formal model, machine-checked proofs, or threat-model validation.
  Authors: We acknowledge that the security properties are presented conditionally on the trusted broker. This manuscript proposes an architecture and prototype rather than a formally verified system. In the revision, we will expand the Security Goals section with a more detailed informal threat model that states the broker assumptions and the conditional nature of the claims explicitly. Machine-checked proofs are outside the current scope. (revision: partial)
- Referee: [Evaluation Plan] The plan covers prompt injection, tool misuse, and SSH abuse, but no experimental results, attack simulations, or measurements are reported, leaving the assertion that capabilities remain non-exportable an untested design property rather than a demonstrated one.
  Authors: The manuscript describes the design, a Rust prototype, and an evaluation plan, but does not include results; the focus is on the proposed system. We will revise the Evaluation Plan section to state clearly that the evaluations are planned future work and add a note on the current status of the prototype. Any basic tests performed during development will be summarized where appropriate. (revision: partial)
- Referee: [Prototype Implementation] The trusted broker is described as local and uncompromised, but there is no analysis of how its implementation avoids introducing new attack surfaces or how its correctness was established, which is load-bearing for all security claims.
  Authors: We agree that additional analysis is needed. The revised version will discuss potential attack surfaces in the broker implementation, such as policy misconfiguration and code vulnerabilities, and how the design and Rust's safety guarantees help mitigate them. We will also state that the prototype's correctness currently rests on implementation testing rather than formal methods. (revision: yes)
Circularity Check
No circularity: system design with no derivations, equations, or self-referential predictions
Full rationale
The paper describes an architectural system (CapSeal) for replacing bearer secrets with scoped capabilities via a trusted broker. It lists components (capability issuance, schema-constrained HTTP/SSH execution, anti-replay binding, policy evaluation, audit trails) and states conditional security goals plus an evaluation plan, but supplies no equations, fitted parameters, first-principles derivations, or predictions. No load-bearing steps reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains. The contribution is a design reframing whose correctness claims are independent of any internal mathematical reduction and rest on external assumptions about the broker.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: a local trusted broker can be maintained outside the agent's control and will correctly enforce issued capabilities.
invented entities (1)
- Capability-sealed secret mediation broker (no independent evidence)
Forward citations
Cited by 1 Pith paper
- The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck — PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in Age…
Reference graph
Works this paper leans on
- [1] K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, "Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection," in Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (CCS '23), ACM, Nov. 2023, pp. 79–90.
- [2] D. Lee and M. Tiwari, "Prompt infection: LLM-to-LLM prompt injection within multi-agent systems," 2024.
- [3] Z. Chen, B. Li, D. Song, Z. Xiang, and C. Xiao, "AgentPoison: Red-teaming LLM agents via poisoning memory or knowledge bases," in Advances in Neural Information Processing Systems 37 (NeurIPS 2024), pp. 130185–130213.
- [4] Y. Wang, D. Xue, S. Zhang, and S. Qian, "BadAgent: Inserting and activating backdoor attacks in LLM agents," in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 9811–9827.
- [5] X. Hou, Y. Zhao, S. Wang, and H. Wang, "Model context protocol (MCP): Landscape, security threats, and future research directions," ACM Transactions on Software Engineering and Methodology, Feb. 2026.
- [6] Z. Wang, Y. Gao, Y. Wang, S. Liu, H. Sun, H. Cheng, G. Shi, H. Du, and X. Li, "MCPTox: A benchmark for tool poisoning on real-world MCP servers," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, pp. 35811–35819, Mar. 2026.
- [7] L. Williams, G. Benedetti, S. Hamer, R. Paramitha, I. Rahman, M. Tamanna, G. Tystahl, N. Zahan, P. Morrison, Y. Acar, M. Cukier, C. Kästner, A. Kapravelos, D. Wermke, and W. Enck, "Research directions in software supply chain security," ACM Transactions on Software Engineering and Methodology, vol. 34, no. 5, pp. 1–38, May 2025.
- [8] P. Przymus and T. Durieux, "Wolves in the repository: A software engineering analysis of the xz utils supply chain attack," in 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR), IEEE, Apr. 2025, pp. 91–102.
- [9] S. A. Crosby and D. S. Wallach, "Efficient data structures for tamper-evident logging," in USENIX Security Symposium, 2009.
- [10] IETF, "The OAuth 2.0 authorization framework: Bearer token usage," RFC 6750, 2012. Available: https://www.rfc-editor.org/rfc/rfc6750
- [11] Y. Li, H. Wen, W. Wang, X. Li, Y. Yuan, G. Liu, J. Chen, W. Yao, X. Fu, M. Liu et al., "Personal LLM agents: Insights and survey about the capability, efficiency and security," 2024.
- [12] Anthropic, "Model context protocol specification," 2025. Available: https://modelcontextprotocol.io/specification
- [13] J. Shi, Z. Yuan, G. Tie, P. Zhou, N. Z. Gong, and L. Sun, "Prompt injection attack to tool selection in LLM agents," arXiv preprint arXiv:2504.19793, 2025.
- [14] H. An, J. Zhang, T. Du, C. Yu, W. Wang, Y. Li, H. Zhang, J. Zhou, J. Huang, and Y. Zhuge, "IPIGuard: A novel tool dependency graph-based defense against indirect prompt injection in LLM agents," in Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025, arXiv:2508.15310.
- [15] F. Jia, T. Wu, X. Qin, G. Liu, S. Yang, M. Zhao, Y. Liu, S. Ding, X. Li, J. Huang, X. Liu, and L. Sun, "The task shield: Enforcing task alignment to defend against indirect prompt injection in LLM agents," in Annual Meeting of the Association for Computational Linguistics (ACL), 2024, arXiv:2412.16682.
- [16] K. Zhu, X. Yang, J. Wang, X. Yan, H. Qi, Y. Chen, L. Ye, Y. Xie, Y. Mao, Y. Wang, B. Zhou, Y. Chen, J. Leskovec, X. Xie, Y. Zhang, and W. Zhou, "MELON: Provable defense against indirect prompt injection attacks in AI agents," in International Conference on Machine Learning (ICML), 2025.
- [17] HashiCorp, "Vault transit secrets engine," 2025. Available: https://developer.hashicorp.com/vault/docs/secrets/transit
- [18] Amazon Web Services, "AWS Key Management Service developer guide," 2025. Available: https://docs.aws.amazon.com/kms/
- [19] IETF, "Certificate transparency version 2.0," RFC 9162, 2021. Available: https://www.rfc-editor.org/rfc/rfc9162
- [20] "JSON type definition," RFC 8927, 2020. Available: https://www.rfc-editor.org/rfc/rfc8927
- [21] "ssh_config," 2025. Available: https://man.openbsd.org/ssh_config
- [22] "ssh-add," 2025. Available: https://man.openbsd.org/ssh-add
- [23] E. Debenedetti, I. Shumailov, T. Fan, J. Hayes, N. Carlini, D. Fabian, C. Kern, C. Shi, A. Terzis, and F. Tramèr, "Defeating prompt injections by design," 2025. Available: https://arxiv.org/abs/2503.18813
- [24] S. Liu, C. Li, C. Wang, J. Hou, Z. Chen, L. Zhang, Z. Liu, Q. Ye, Y. Hei, X. Zhang, and Z. Wang, "Clawkeeper: Comprehensive safety protection for openclaw agents through skills, plugins, and watchers." Available: https://arxiv.org/abs/2603.24414
- [26] M. S. Miller, K.-P. Yee, and J. S. Shapiro, "Capability myths demolished," Technical Report, 2003. Available: http://www.erights.org/talks/myths/
- [27] A. Birgisson, J. G. Politz, U. Erlingsson, A. Taly, M. Vrable, and M. Lentczner, "Macaroons: Cookies with contextual caveats for decentralized authorization in the cloud," in Proceedings of the Network and Distributed System Security Symposium (NDSS), 2014.
- [28] Y. Chen, Y. Zhang, and X. Lin, "ZKP-CapBAC: Capability-based access control via on-chain zero-knowledge proofs for cross-domain hiding delegation tree," in IEEE International Conference on Computer Communications (INFOCOM), 2025.
- [29] D. Devriese, L. Birkedal, and F. Piessens, "Reasoning about object capabilities with logical relations and effect parametricity," in IEEE European Symposium on Security and Privacy (EuroS&P), 2016.
- [30] J. Zhang, "Right to history: A sovereignty kernel for verifiable AI agent execution," 2026.
- [31] "The Transport Layer Security (TLS) protocol version 1.3," RFC 8446, 2018. Available: https://www.rfc-editor.org/rfc/rfc8446
- [33] "OAuth 2.0 mutual-TLS client authentication and certificate-bound access tokens," RFC 8705, 2020. Available: https://www.rfc-editor.org/rfc/rfc8705
- [34] "Channel bindings for TLS 1.3," RFC 9266, 2022. Available: https://www.rfc-editor.org/rfc/rfc9266
- [35] A. Venčkauskas, D. Kukta, and Š. Grigaliūnas, "Enhancing microservices security with token-based access control method," Sensors, vol. 23, no. 6, p. 3363, 2023.
- [36] OWASP, "OWASP Top 10 for LLM applications," 2025. Available: https://genai.owasp.org/