AgenTEE: Confidential LLM Agent Execution on Edge Devices
Pith reviewed 2026-05-10 04:43 UTC · model grok-4.3
The pith
AgenTEE isolates LLM agent components in attested confidential virtual machines to enable secure edge execution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AgenTEE places the agent runtime, inference engine, and third-party applications into independently attested confidential virtual machines (cVMs) and mediates their interaction through explicit, verifiable communication channels. Built on Arm Confidential Compute Architecture (CCA), AgenTEE enforces strong system-level isolation of sensitive assets and runtime state. Our evaluation shows that such a multi-cVM system is practical, achieving near-native performance with less than 5.15% runtime overhead compared to commodity OS multi-process deployments.
What carries the argument
Multi-cVM architecture with independent attestation and explicit mediation channels on Arm CCA, which isolates the runtime, inference engine, and third-party components to safeguard proprietary assets and state.
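The mediation pattern described above can be sketched as a toy attestation gate: a channel to a cVM is opened only if the measurement it reports matches a trusted reference. This is a minimal illustration, not AgenTEE's protocol; the component names and image bytes are invented, and real Arm CCA attestation involves signed reports rooted in the hardware root of trust rather than bare hashes.

```python
import hashlib

# Trusted reference measurements per cVM image (hypothetical stand-ins for
# the hashes a signed Arm CCA attestation report would carry).
TRUSTED = {
    "agent-runtime": hashlib.sha256(b"runtime-image-v1").hexdigest(),
    "inference-engine": hashlib.sha256(b"engine-image-v1").hexdigest(),
}

def attest(image: bytes) -> str:
    """Simulate a cVM measuring its loaded image at boot."""
    return hashlib.sha256(image).hexdigest()

def open_channel(component: str, measurement: str) -> bool:
    """Grant a mediated channel only when the measurement matches the reference."""
    return TRUSTED.get(component) == measurement

# A runtime booted from the expected image gets a channel...
ok = open_channel("agent-runtime", attest(b"runtime-image-v1"))
# ...while a tampered image is refused one.
bad = open_channel("agent-runtime", attest(b"tampered-image"))
print(ok, bad)  # True False
```

The point of the gate is that the verifier never trusts the component's identity claim, only its measurement; everything else in the pipeline builds on that check.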
If this is right
- LLM agents can run on edge devices while keeping sensitive prompts and weights protected from local software attacks.
- Third-party applications can participate in agent pipelines without gaining access to core runtime state.
- Edge-based automation becomes feasible for tasks that require both low latency and strong privacy guarantees.
- The performance cost of this isolation stays low enough to support real-time agent operation.
Where Pith is reading between the lines
- The same cVM separation pattern could support other non-LLM AI pipelines that combine models with external services on edge hardware.
- Hardware vendors might prioritize broader attestation coverage to reduce reliance on explicit channels alone.
- Developers could test the approach on additional edge platforms to map where Arm CCA support is available.
Load-bearing premise
Arm CCA attestation and the explicit mediation channels between cVMs suffice to block software attacks and malicious device owners, with no unaddressed side channels or attestation bypasses.
What would settle it
A working attack that extracts system prompts or model weights from an AgenTEE deployment on an Arm CCA edge device despite the cVM isolation and attested channels.
read the original abstract
Large Language Model (LLM) agents provide powerful automation capabilities, but they also create a substantially broader attack surface than traditional applications due to their tight integration with non-deterministic models and third-party services. While current deployments primarily rely on cloud-hosted services, emerging designs increasingly execute agents directly on edge devices to reduce latency and enhance user privacy. However, securely hosting such complex agent pipelines on edge devices remains challenging. These deployments must protect proprietary assets (e.g., system prompts and model weights) and sensitive runtime state on heterogeneous platforms that are vulnerable to software attacks and potentially controlled by malicious users. To address these challenges, we present AgenTEE, a system for deploying confidential agent pipelines on edge devices. AgenTEE places the agent runtime, inference engine, and third-party applications into independently attested confidential virtual machines (cVMs) and mediates their interaction through explicit, verifiable communication channels. Built on Arm Confidential Compute Architecture (CCA), a recent extension to Arm platforms, AgenTEE enforces strong system-level isolation of sensitive assets and runtime state. Our evaluation shows that such multi-cVMs system is practical, achieving near-native performance with less than 5.15% runtime overhead compared to commodity OS multi-process deployments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents AgenTEE, a system for confidential execution of LLM agents on edge devices. It isolates the agent runtime, inference engine, and third-party applications in independently attested confidential virtual machines (cVMs) built on Arm Confidential Compute Architecture (CCA), with interactions mediated through explicit verifiable communication channels. The work claims strong system-level isolation of sensitive assets (prompts, weights, runtime state) against software attacks and malicious device owners, while demonstrating practicality via near-native performance with less than 5.15% runtime overhead relative to commodity OS multi-process deployments.
Significance. If the security guarantees hold, the result is significant for enabling privacy-preserving LLM agent execution on untrusted edge hardware. The engineering focus on multi-cVM isolation with mediated channels and the reported low overhead provide a concrete path toward practical confidential AI on heterogeneous platforms, addressing a growing need as agents move from cloud to edge.
major comments (2)
- [Abstract] The central claim of 'strong system-level isolation' protecting against software attacks from malicious device owners rests on Arm CCA attestation and cVM mediation, but the manuscript provides no concrete evidence or analysis ruling out side-channel leakage (e.g., via shared caches or memory controllers during LLM inference) or attestation bypasses; this leaves the confidentiality guarantee only partially supported.
- [Abstract/Evaluation] The reported '< 5.15% runtime overhead' is presented as a key practicality result, yet no details are given on the threat model coverage, side-channel analysis, measurement methodology, workloads, or baseline comparison setup, making it impossible to assess whether the number substantiates the security-plus-performance claim.
minor comments (1)
- [Abstract] The acronym 'cVMs' is used before its expansion ('confidential virtual machines') is provided, which could be clarified on first use for readability.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review. The comments correctly identify areas where the abstract and security discussion could be strengthened with additional clarification. We address each point below and have revised the manuscript to incorporate explicit discussion of threat model boundaries and expanded evaluation details.
read point-by-point responses
- Referee: [Abstract] The central claim of 'strong system-level isolation' protecting against software attacks from malicious device owners rests on Arm CCA attestation and cVM mediation, but the manuscript provides no concrete evidence or analysis ruling out side-channel leakage (e.g., via shared caches or memory controllers during LLM inference) or attestation bypasses; this leaves the confidentiality guarantee only partially supported.
Authors: We appreciate the referee highlighting the need for clearer boundaries on our security claims. Section 3 defines the threat model as software attacks originating from a malicious device owner (e.g., via compromised hypervisor or rich OS), which Arm CCA is designed to mitigate through hardware-enforced cVM isolation and attestation. The attestation mechanism relies on the hardware root of trust to verify cVM images and configurations, preventing bypasses at the software level. However, we agree that the manuscript does not explicitly analyze hardware side-channels such as cache or memory-controller leakage during LLM inference. We have revised the Security Analysis section to add a dedicated paragraph stating that our guarantees are scoped to software attacks, that side-channel resistance is not claimed without additional mitigations (e.g., constant-time code or cache partitioning), and that such attacks remain an open consideration for future work, with references to related literature on confidential computing side-channels. This revision clarifies the scope without overstating the guarantees. revision: yes
- Referee: [Abstract/Evaluation] The reported '< 5.15% runtime overhead' is presented as a key practicality result, yet no details are given on the threat model coverage, side-channel analysis, measurement methodology, workloads, or baseline comparison setup, making it impossible to assess whether the number substantiates the security-plus-performance claim.
Authors: We acknowledge that the abstract is too brief and does not direct readers to the supporting details. The full evaluation (Section 5) specifies the workloads (representative LLM agent pipelines involving tool invocation and multi-turn reasoning with models such as Llama-7B), the baseline (identical agent deployment using standard Linux multi-process isolation on the same Arm edge hardware), and the measurement approach (cycle-accurate timers with repeated runs to report average overhead). Threat model coverage is described in Section 3. Side-channel considerations are now addressed in the revised Security Analysis section as noted above. We have updated the abstract to include a concise reference to the evaluation setup and added a summary table in Section 5 that explicitly lists methodology, workloads, and baseline configuration. These changes make the reported overhead verifiable while preserving the original experimental results. revision: yes
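The headline overhead figure is a ratio of mean runtimes between the cVM deployment and the multi-process baseline. As a sketch with invented timings (not the paper's measurements), the arithmetic behind a "< 5.15%" claim looks like this:

```python
from statistics import mean

def runtime_overhead_pct(cvm_runs, baseline_runs):
    """Percent slowdown of the cVM deployment relative to the baseline,
    computed from repeated-run averages."""
    return (mean(cvm_runs) / mean(baseline_runs) - 1.0) * 100.0

# Illustrative per-run times in seconds (made up, not the paper's data):
baseline = [10.0, 10.2, 9.8]   # commodity OS multi-process deployment
cvm = [10.4, 10.6, 10.2]       # multi-cVM deployment of the same pipeline
print(round(runtime_overhead_pct(cvm, baseline), 2))  # 4.0
```

A single percentage of this form is only as informative as the workloads and baseline behind it, which is exactly what the referee's comment asks the authors to document.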
Circularity Check
No circularity: engineering system relies on external hardware and empirical benchmarks
full rationale
The paper presents a system architecture for confidential LLM agent execution on edge devices using Arm CCA cVMs and explicit mediation channels. No equations, fitted parameters, or self-referential definitions appear in the provided text. Security and performance claims rest on the external properties of Arm CCA attestation plus direct timing comparisons to commodity OS multi-process setups, without any reduction of the central result to its own inputs by construction. No self-citations are load-bearing for the isolation guarantees, and the work contains no ansatzes, uniqueness theorems, or renamings of prior results that would trigger the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Arm CCA provides hardware-enforced isolation and remote attestation for confidential VMs that is sufficient against software attacks on edge devices.
invented entities (1)
- Explicit verifiable communication channels between cVMs (no independent evidence)
Reference graph
Works this paper leans on
- [1] 2025. kvmtool-cca. https://gitlab.arm.com/linux-arm/kvmtool-cca/-/tree/cca/v3?ref_type=heads Accessed Feb 2025.
- [3] Sina Abdollahi, Amir Al Sadi, Marios Kogias, David Kotz, and Hamed Haddadi. 2025. Confidential, Attestable, and Efficient Inter-CVM Communication with Arm CCA. arXiv preprint arXiv:2512.01594 (2025).
- [6] Android. 2025. AVF architecture. https://source.android.com/docs/core/virtualization/architecture#memory-ownership Accessed July 2025.
- [7] Anthropic. 2025. Effective context engineering for AI agents. https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents Accessed Feb 2026.
- [8] Apple Inc. 2025. Secure Enclave. https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web Accessed Feb 2025.
- [9] Arm Limited. 2025. Arm Confidential Compute Architecture. https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture Accessed Feb 2025.
- [10] Arm Limited. 2025. Learn the architecture - TrustZone for AArch64. https://developer.arm.com/documentation/102418/latest/ Accessed Feb 2025.
- [11] Arm Limited. 2025. linux-cca. https://gitlab.arm.com/linux-arm/linux-cca/-/commit/fad35572db Accessed Feb 2025.
- [13] Ferdinand Brasser, David Gens, Patrick Jauernig, Ahmad-Reza Sadeghi, and Emmanuel Stapf. 2019. SANCTUARY: ARMing TrustZone with User-space Enclaves. In NDSS.
- [14] Wei Chen, Zhiyuan Li, Zhen Guo, and Yikang Shen. 2025. Octo-planner: On-device language model for planner-action agents. In International Workshop on Engineering Multi-Agent Systems. Springer, 141–156.
- [15] Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. 2025. Defeating prompt injections by design. arXiv preprint arXiv:2503.18813 (2025).
- [16] Tim Dettmers, Mike Lewis, Younes Belkada, and Luke Zettlemoyer. 2022. Gpt3.int8(): 8-bit matrix multiplication for transformers at scale. Advances in Neural Information Processing Systems 35 (2022), 30318–30332.
- [18] Embrace The Red. 2025. Claude Code: Data Exfiltration with DNS (CVE-2025-55284). https://embracethered.com/blog/posts/2025/claude-code-exfiltration-via-dns-requests/ Accessed Feb 2026.
- [19] Embrace The Red. 2025. How Devin AI Can Leak Your Secrets via Multiple Means. https://embracethered.com/blog/posts/2025/devin-can-leak-your-secrets/ Accessed Feb 2026.
- [20] Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin Lauter, Michael Naehrig, and John Wernsing. 2016. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy. In International Conference on Machine Learning. PMLR, 201–210.
- [21] Google Cloud. 2026. Use system instructions (Generative AI on Vertex AI). https://docs.cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/system-instructions Accessed Feb 2026.
- [22] Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security. 79–90.
- [23] Hugging Face. [n. d.]. bartowski/Llama-3.2-1B-Instruct-GGUF. https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF Accessed Feb 2025.
- [24] Hugging Face. 2019. openai-community/gpt2-medium. https://huggingface.co/openai-community/gpt2-medium Accessed Feb 2025.
- [25] Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, and Yinzhi Cao. 2024. PLeak: Prompt leaking attacks against large language model applications. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security. 3600–3614.
- [27] Intel. 2025. Intel Software Guard Extensions. Retrieved June 7, 2025 from https://www.intel.com/content/www/us/en/developer/tools/software-guard-extensions/overview.html
- [29] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient memory management for large language model serving with PagedAttention. In Proceedings of the 29th Symposium on Operating Systems Principles. 611–626.
- [31] LangChain. 2026. LangChain: Build context-aware, reasoning applications with LangChain's flexible abstractions and AI-first toolkit. https://www.langchain.com/ Accessed Feb 2026.
- [32] Linaro. 2024. MAD24-410 Arm Confidential Compute Architecture open-source enablement update. Retrieved March 9, 2025 from https://resources.linaro.org/en/resource/rEjhEezEvnNMC3LALzUTrr
- [33] Linux. 2025. seccomp(2) — Linux manual page. https://man7.org/linux/man-pages/man2/seccomp.2.html Accessed Feb 2026.
- [35] Mohammad M Maheri, Sunil Cotterill, Alex Davidson, and Hamed Haddadi. 2025. ZK-APEX: Zero-Knowledge Approximate Personalized Unlearning with Executable Proofs. arXiv preprint arXiv:2512.09953 (2025).
- [37] Microsoft 365. [n. d.]. Microsoft 365 Copilot hub. https://learn.microsoft.com/en-us/copilot/microsoft-365/ Accessed Feb 2026.
- [38] Fan Mo, Hamed Haddadi, Kleomenis Katevas, Eduard Marin, Diego Perino, and Nicolas Kourtellis. 2021. PPFL: Privacy-preserving federated learning with trusted execution environments. In Proceedings of the 19th Annual International Conference on Mobile Systems, Applications, and Services. 94–108.
- [39] OpenAI Developers. [n. d.]. Using tools. https://developers.openai.com/api/docs/guides/tools Accessed Feb 2026.
- [41] Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. 2023. ToolLLM: Facilitating large language models to master 16000+ real-world APIs. arXiv preprint arXiv:2307.16789 (2023).
- [42] Radxa. [n. d.]. ROCK 5B. Retrieved April 14, 2025 from https://radxa.com/products/rock5/5b/
- [43] M Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko, Ebrahim M Songhori, Thomas Schneider, and Farinaz Koushanfar. 2018. Chameleon: A hybrid secure computation framework for machine learning applications. In Proceedings of the 2018 Asia Conference on Computer and Communications Security. 707–721.
- [44] Fan Sang, Jaehyuk Lee, Xiaokuan Zhang, and Taesoo Kim. 2025. PORTAL: Fast and Secure Device Access with Arm CCA for Modern Arm Mobile System-on-Chips (SoCs). In 2025 IEEE Symposium on Security and Privacy (SP). IEEE, 4099–4116.
- [45] Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language models can teach themselves to use tools. Advances in Neural Information Processing Systems 36 (2023), 68539–68551.
- [46] Tianxiang Shen, Ji Qi, Jianyu Jiang, Xian Wang, Siyuan Wen, Xusheng Chen, Shixiong Zhao, Sen Wang, Li Chen, Xiapu Luo, et al. 2022. SOTER: Guarding Black-box Inference for General Neural Networks at the Edge. In 2022 USENIX Annual Technical Conference (USENIX ATC 22). 723–738.
- [47] Sandra Siby, Sina Abdollahi, Mohammad Maheri, Marios Kogias, and Hamed Haddadi. 2024. GuaranTEE: Towards Attestable and Private ML with CCA. In Proceedings of the 4th Workshop on Machine Learning and Systems. 1–9.
- [48] Supraja Sridhara, Andrin Bertschi, Benedict Schlüter, Mark Kuhne, Fabio Aliberti, and Shweta Shinde. 2024. ACAI: Extending Arm Confidential Computing Architecture Protection from CPUs to Accelerators. In 33rd USENIX Security Symposium (USENIX Security '24).
- [49] Lizhi Sun, Shuocheng Wang, Hao Wu, Yuhang Gong, Fengyuan Xu, Yunxin Liu, Hao Han, and Sheng Zhong. 2022. LEAP: TrustZone Based Developer-Friendly TEE for Intelligent Mobile Apps. IEEE Transactions on Mobile Computing (2022).
- [50] Zhichuang Sun, Ruimin Sun, Changming Liu, Amrita Roy Chowdhury, Long Lu, and Somesh Jha. 2023. ShadowNet: A secure and efficient on-device model inference system for convolutional neural networks. In 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 1596–1612.
- [51] Trusted Firmware. 2025. TF-A. https://www.trustedfirmware.org/projects/tf-a Accessed Feb 2025.
- [52] Trusted Firmware. 2025. TF-RMM. https://www.trustedfirmware.org/projects/tf-rmm Accessed Feb 2025.
- [54] Chenxu Wang, Fengwei Zhang, Yunjie Deng, Kevin Leach, Jiannong Cao, Zhenyu Ning, Shoumeng Yan, and Zhengyu He. 2024. CAGE: Complementing Arm CCA with GPU Extensions. In Network and Distributed System Security (NDSS) Symposium.
- [56] Wikipedia. [n. d.]. Widevine. https://en.wikipedia.org/wiki/Widevine Accessed Feb 2026.
- [57] Guanlong Wu, Zheng Zhang, Yao Zhang, Weili Wang, Jianyu Niu, Ye Wu, and Yinqian Zhang. 2025. I Know What You Asked: Prompt Leakage via KV-Cache Sharing in Multi-Tenant LLM Serving. In NDSS.
- [59] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R Narasimhan, and Yuan Cao. 2022. ReAct: Synergizing reasoning and acting in language models. In The Eleventh International Conference on Learning Representations.
- [60] Mengxia Yu, De Wang, Qi Shan, Colorado J Reed, and Alvin Wan.
discussion (0)