arxiv: 2606.22560 · v1 · pith:D7IOXRK4new · submitted 2026-06-21 · 💻 cs.CR

Evidence-Bound Gateway-Path Provenance for Third-Party LLM Inference

Pith reviewed 2026-06-26 10:05 UTC · model grok-4.3

classification 💻 cs.CR
keywords: LLM gateway · attested execution · provenance · enclaves · third-party inference · routing evidence · secure mediation

The pith

An attested gateway runtime inside a hardware enclave produces signed evidence binding the exact LLM routing path, fallback decisions, and stream commitments to its measurement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an architecture that splits LLM gateway operation into an operator-controlled plane and a measured Attested Gateway Runtime inside an enclave. The runtime alone decrypts client requests, applies routing policy, calls upstream providers, and signs evidence that records the policy, chosen route, endpoint, and stream details. Clients check the attestation and metadata first, then encrypt requests under keys tied to the runtime measurement, so any tampering outside the enclave produces detectable mismatches or fail-closed behavior. A Rust implementation on AWS Nitro Enclaves demonstrates the mechanism with modest overhead.

Core claim

The evidence-bound LLM gateway architecture separates the operator control plane from an attested execution plane. Within the gateway, a measured Attested Gateway Runtime (AGR) is the only component allowed to decrypt requests, enforce path policy, construct upstream calls, and sign evidence. Clients verify signed release metadata and fresh attestation before encrypting requests to keys bound to the AGR measurement. AGR enforces request-scoped routing, fallback, and endpoint constraints, invokes admitted providers, returns encrypted response streams, and signs evidence binding the policy, selected route, endpoint identity, stream commitments, and completion metadata to the attested runtime.
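The client-side check described above (verify release metadata and fresh attestation, then encrypt to a measurement-bound key) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `ReleaseMetadata`, `Attestation`, `preflight`, and every field name are hypothetical, and parsing plus signature validation of the real release record and attestation document are assumed to have happened before this function runs.

```python
from dataclasses import dataclass
import hmac


@dataclass(frozen=True)
class ReleaseMetadata:
    """Hypothetical release record; its signature is assumed already verified."""
    version: str
    measurement: str  # hex digest identifying the AGR image (illustrative)


@dataclass(frozen=True)
class Attestation:
    """Hypothetical, already-parsed attestation document."""
    measurement: str
    nonce: bytes       # echoes the client's challenge
    issued_at: float   # Unix seconds
    public_key: bytes  # key the client would encrypt requests to


class VerificationError(Exception):
    pass


def preflight(release, att, expected_nonce, now, max_age_s=300.0):
    """Return the AGR public key only if every check passes; otherwise raise."""
    if not hmac.compare_digest(att.nonce, expected_nonce):
        raise VerificationError("nonce mismatch: attestation is not fresh")
    if not 0 <= now - att.issued_at <= max_age_s:
        raise VerificationError("attestation outside freshness window")
    if not hmac.compare_digest(att.measurement, release.measurement):
        raise VerificationError("measurement does not match signed release")
    return att.public_key
```

The function raises rather than returning a flag, so a caller cannot proceed to encryption with an unverified key by forgetting to check a return value. The 300-second freshness window is an arbitrary placeholder.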

What carries the argument

Attested Gateway Runtime (AGR), the sole enclave-resident component permitted to decrypt requests, enforce routing policy, invoke providers, and sign binding evidence.

If this is right

  • Clients can independently confirm the provider, model, fallback status, and stream integrity without trusting the gateway operator.
  • Tampering with routing decisions or evidence outside the AGR is detected, and the request fails closed.
  • Usage records can be cryptographically bound to the attested path and endpoint.
  • Request-scoped constraints on endpoints and policies are enforced inside the attested runtime.
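As an illustration of request-scoped route, fallback, and endpoint constraints, the sketch below selects a route under a client-supplied policy and reports whether fallback occurred. The types `Route` and `PathPolicy`, the function `select_route`, and the example URLs are assumptions made for this sketch; the paper's actual policy format is not shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    endpoint: str


@dataclass(frozen=True)
class PathPolicy:
    """Hypothetical request-scoped policy carried with the encrypted request."""
    routes: tuple                 # ordered preference, primary first
    allowed_endpoints: frozenset  # endpoints the client admits
    allow_fallback: bool


class PolicyViolation(Exception):
    pass


def select_route(policy, is_available):
    """Pick the first admitted, available route; return (route, fallback_used)."""
    for index, route in enumerate(policy.routes):
        if route.endpoint not in policy.allowed_endpoints:
            continue  # endpoint not admitted by the client: never call it
        if index > 0 and not policy.allow_fallback:
            break  # only the primary route is permitted
        if is_available(route):
            return route, index > 0
    raise PolicyViolation("no admitted route available; failing closed")


primary = Route("provider-a", "model-x", "https://a.example/v1")
backup = Route("provider-b", "model-y", "https://b.example/v1")
policy = PathPolicy(
    routes=(primary, backup),
    allowed_endpoints=frozenset({primary.endpoint, backup.endpoint}),
    allow_fallback=True,
)
```

With this policy, `select_route(policy, lambda r: r == backup)` returns `(backup, True)`, so the fallback flag is available to be recorded in the evidence. With `allow_fallback=False` the same call raises `PolicyViolation` instead of silently switching providers.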

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same separation could support verifiable mediation for other third-party AI services beyond LLMs.
  • Widespread use might enable regulatory audit of routing choices without requiring direct client-to-provider links.
  • The approach may generalize to additional trusted execution environments beyond the AWS Nitro prototype.

Load-bearing premise

Hardware enclave measurement and isolation ensure the operator cannot access or alter the AGR's keys, policy enforcement, or evidence signing.

What would settle it

A case in which the gateway operator delivers an altered route or a manipulated stream, yet the client still accepts the accompanying signed evidence as valid after the attestation check.
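One way to make the stream half of this test concrete is a commitment over the delivered chunks that the client recomputes. The sketch below uses a length-prefixed SHA-256 hash chain; the construction, the domain tag, and the function names are assumptions for illustration, since the page does not say how the paper builds its stream commitments.

```python
import hashlib
import hmac


def stream_commitment(chunks):
    """Fold chunks into a running SHA-256 chain and return the final hex digest."""
    state = hashlib.sha256(b"stream-commitment-v0").digest()  # illustrative domain tag
    for chunk in chunks:
        # Length-prefix each chunk so chunk boundaries are part of the commitment.
        framed = len(chunk).to_bytes(8, "big") + chunk
        state = hashlib.sha256(state + framed).digest()
    return state.hex()


def stream_matches(chunks, committed_hex):
    """Compare the recomputed chain against the committed value in constant time."""
    return hmac.compare_digest(stream_commitment(chunks), committed_hex)
```

Under this construction, editing a chunk, dropping a trailing chunk, or re-splitting the same bytes at different boundaries all change the final digest, so a client holding a signed commitment would reject the stream.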

Figures

Figures reproduced from arXiv: 2606.22560 by Fei Wang, Zebai Tian.

Figure 1: Request processing and evidence verification. Solid and dashed arrows show request and response paths; [E2E], [TLS], …
Figure 2: Local deterministic mock mechanism probe across concurrency levels. The figure isolates latency and first-content …
Original abstract

Third-party LLM gateways have become a critical infrastructure layer between applications and external LLM providers. Conventional gateways do more than forward traffic: they decide which provider and model are called, whether fallback occurred, which stream is delivered, and what usage record should be billed. Because these decisions and records are authored inside the operator-controlled service, clients cannot independently distinguish honest mediation from route substitution, hidden fallback, stream manipulation, or forged provenance. We present an evidence-bound LLM gateway architecture that separates the operator control plane from an attested execution plane. Within the gateway, a measured Attested Gateway Runtime (AGR) is the only component allowed to decrypt requests, enforce path policy, construct upstream calls, and sign evidence. Clients verify signed release metadata and fresh attestation before encrypting requests to keys bound to the AGR measurement. AGR enforces request-scoped routing, fallback, and endpoint constraints, invokes admitted providers, returns encrypted response streams, and signs evidence binding the policy, selected route, endpoint identity, stream commitments, and completion metadata to the attested runtime. An initial Rust prototype on AWS Nitro Enclaves shows modest mechanism overhead and fail-closed detection of policy, routing, endpoint, and stream-evidence tampering outside the attested runtime.
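The evidence binding named in the abstract (policy, selected route, endpoint identity, stream commitments, and completion metadata tied to the attested runtime) can be sketched as a canonically encoded record with a verification step that fails closed. Everything here is an assumption for illustration: the field names are hypothetical, and HMAC-SHA256 with a shared key stands in for the signature, whereas an actual AGR would presumably sign with a private key held inside the enclave.

```python
import hashlib
import hmac
import json


def canonical_bytes(record):
    """Deterministic JSON encoding so signer and verifier hash identical bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_evidence(record, key):
    # Stand-in only: HMAC-SHA256 replaces the enclave-held signing key.
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify_evidence(record, tag, key, expected_measurement):
    """Fail closed: True only if the tag and the bound measurement both check out."""
    if not hmac.compare_digest(sign_evidence(record, key), tag):
        return False
    return record.get("measurement") == expected_measurement


# Hypothetical evidence record; all values are placeholders.
evidence = {
    "measurement": "ab" * 32,
    "policy_digest": hashlib.sha256(b"policy").hexdigest(),
    "route": {"provider": "provider-a", "model": "model-x", "fallback_used": False},
    "endpoint": "https://a.example/v1",
    "stream_commitment": hashlib.sha256(b"stream").hexdigest(),
    "completion": {"finish_reason": "stop", "output_tokens": 42},
}
key = b"demo-key-not-for-production"
tag = sign_evidence(evidence, key)
```

Because the tag covers the whole canonical record, changing any single field after signing (for example flipping `fallback_used`) makes `verify_evidence` return `False`, as does presenting valid evidence from a runtime with a different measurement.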

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes an evidence-bound LLM gateway architecture separating the operator control plane from an attested execution plane. An Attested Gateway Runtime (AGR) inside an enclave (AWS Nitro Enclaves) is the sole component permitted to decrypt requests, enforce routing/policy/endpoint constraints, invoke providers, and sign evidence binding policy, route, endpoint identity, stream commitments, and metadata. Clients verify signed release metadata and fresh attestation before encrypting requests to measurement-bound keys. A Rust prototype demonstrates modest overhead and fail-closed detection of tampering outside the attested runtime.

Significance. If the claims hold, the work offers a practical mechanism for verifiable provenance in third-party LLM gateways, addressing route substitution, hidden fallbacks, stream manipulation, and forged records. The design applies standard attested-execution primitives (measurement-bound keys, isolation) in a new domain. The prototype provides initial feasibility evidence. This is a solid systems contribution for a timely problem, though its impact depends on stronger validation of the security properties.

major comments (1)
  1. [Abstract] Abstract (paragraph on AGR and prototype): the central claim that the architecture enables clients to detect policy/routing/endpoint/stream tampering rests on the unelaborated assumption that enclave measurement and isolation suffice; no threat model, formal argument, or empirical tampering-detection results are supplied to substantiate fail-closed behavior beyond the high-level prototype statement.
minor comments (2)
  1. The manuscript would benefit from explicit section headings and a dedicated evaluation section to present any quantitative overhead numbers or tampering-detection experiments referenced in the abstract.
  2. Notation for evidence fields (policy, route, endpoint, stream commitments) should be defined consistently if introduced in later sections.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comment on the abstract is well-taken and highlights the need for clearer linkage between the high-level claims and the supporting details in the body of the paper. We address the point below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] Abstract (paragraph on AGR and prototype): the central claim that the architecture enables clients to detect policy/routing/endpoint/stream tampering rests on the unelaborated assumption that enclave measurement and isolation suffice; no threat model, formal argument, or empirical tampering-detection results are supplied to substantiate fail-closed behavior beyond the high-level prototype statement.

    Authors: We agree that the abstract presents the fail-closed detection claim at a summary level without explicit references. The architecture depends on the standard attestation and isolation properties of AWS Nitro Enclaves (measurement-bound keys and runtime isolation from the operator control plane), which are described in the design section. The Rust prototype implements the attested runtime such that any attempt to tamper with policy, routing, endpoint selection, or stream evidence outside the AGR produces verifiable failures during client-side evidence checking. However, the abstract does not sufficiently point to these elements. We will revise the abstract to include a brief reference to the threat model (standard TEE adversary controlling the host but not the enclave) and the prototype evaluation, and we will ensure the body explicitly summarizes the empirical tampering-detection results. This makes the assumptions and evidence explicit without altering the technical approach.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper presents a system architecture and prototype for an evidence-bound LLM gateway using attested execution on AWS Nitro Enclaves. There are no equations, fitted parameters, predictions, or derivations that reduce to self-definition or self-citation. The central claims rest on the independently established isolation and measurement-binding properties of the cited enclave platform rather than any internal loop or renaming of results. No load-bearing self-citations or ansatzes are present.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The design rests on the security properties of hardware attestation enclaves and introduces the AGR as a new trusted component. It has no free parameters and no invented entities beyond the runtime itself.

axioms (1)
  • domain assumption: Hardware attestation mechanisms such as AWS Nitro Enclaves provide reliable measurement and isolation from the host operator
    Invoked when clients verify attestation and bind keys to the AGR measurement
invented entities (1)
  • Attested Gateway Runtime (AGR) · no independent evidence
    purpose: Enforce request-scoped routing, fallback, and endpoint constraints while signing evidence inside the enclave
    Core new component introduced to separate operator control from attested execution

pith-pipeline@v0.9.1-grok · 5736 in / 1261 out tokens · 28707 ms · 2026-06-26T10:05:19.324852+00:00
