pith. machine review for the scientific record.

arxiv: 2605.10907 · v2 · submitted 2026-05-11 · 💻 cs.CR · cs.AI

Recognition: no theorem link

Engineering Robustness into Personal Agents with the AI Workflow Store

Roxana Geambasu (Columbia University), Lillian Tsai (Google), Mariana Raykova (Google), Pierre Tholoniat (Google), Trishita Tiwari (Google), Wen Zhang (Google)

Pith reviewed 2026-05-13 03:05 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords AI agents · software engineering · robustness · workflows · security · reliability · agent systems · reuse

The pith

AI agents must incorporate rigorous software engineering through reusable hardened workflows to achieve production-grade reliability and security.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current AI agents synthesize plans and execute actions within seconds or minutes in response to prompts, bypassing the iterative design, testing, and evaluation that underpin reliable software. This approach may leave users with fragile prototypes unsuited to high-stakes uses. The paper proposes an AI Workflow Store of pre-hardened, reusable workflows that agents can invoke with greater reliability and security than improvised tool chains. Amortizing the engineering effort across many users would make the added rigor affordable. The work highlights the flexibility-robustness tradeoff and argues for moving beyond purely on-the-fly methods.

Core claim

By focusing on rapid, real-time synthesis, AI agents deliver improvised prototypes rather than systems fit for high-stakes scenarios. Addressing this requires integrating disciplined software engineering processes into the agentic loop to produce hardened, deterministically-constrained workflows that substantially outperform brittle on-the-fly results, with the cost of rigor amortized through reuse in an AI Workflow Store.

What carries the argument

The AI Workflow Store, envisioned as a collection of hardened and reusable agent workflows that provide greater reliability and security than on-the-fly tool chains.
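The store-and-invoke mechanism can be sketched in a few lines. Everything below is an illustrative assumption: the `Workflow`, `WorkflowStore`, and tool names are hypothetical, not interfaces from the paper.

```python
# Hypothetical sketch of an agent invoking a hardened workflow from a store
# instead of synthesizing a plan on the fly. The allowlist stands in for the
# paper's "deterministic constraints"; all names here are illustrative.
from dataclasses import dataclass, field

@dataclass
class Workflow:
    name: str
    version: str
    allowed_tools: frozenset   # deterministic constraint: a vetted tool allowlist
    steps: list                # pre-hardened, audited plan (tool names, in order)

    def run(self, tools: dict, **params):
        results = []
        for step in self.steps:
            if step not in self.allowed_tools:
                # A hardened workflow refuses actions outside its vetted scope.
                raise PermissionError(f"tool '{step}' not permitted by workflow")
            results.append(tools[step](**params))
        return results

@dataclass
class WorkflowStore:
    _registry: dict = field(default_factory=dict)

    def publish(self, wf: Workflow):
        # Publication is where testing, adversarial evaluation, and staged
        # deployment would happen before a workflow becomes invokable.
        self._registry[(wf.name, wf.version)] = wf

    def fetch(self, name: str, version: str) -> Workflow:
        # The agent fetches a pinned, pre-vetted workflow rather than improvising.
        return self._registry[(name, version)]
```

Under this sketch, an agent's loop changes from "synthesize a plan, then execute it" to "fetch a pinned workflow version, then run it", which is where the reliability and security gains would come from.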

If this is right

  • Hardened workflows would allow agents to invoke pre-vetted plans with deterministic constraints, reducing vulnerability to errors or attacks.
  • The cost of rigorous processes like adversarial evaluation and staged deployment would be spread across a broad user base.
  • Agents could transition from prototypes to production-grade systems suitable for high-stakes applications.
  • Research must tackle challenges in workflow design to balance flexibility and robustness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Community-driven curation of workflows could emerge, similar to package repositories, allowing continuous auditing and improvement.
  • Users might gain the ability to inspect and select workflows based on their verified properties, increasing transparency in agent behavior.
  • This model could support domain-specific workflow libraries for areas like finance or healthcare that demand high assurance.

Load-bearing premise

That the extra compute and time required for rigorous software engineering processes can be amortized through reuse across a broad user community without losing the responsiveness users expect.
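The amortization premise reduces to simple arithmetic; a back-of-envelope sketch, with every figure hypothetical:

```python
# Back-of-envelope amortization model. Every number here is an illustrative
# assumption, not data from the paper.
def per_user_cost(hardening_cost: float, marginal_cost: float, users: int) -> float:
    """Fixed engineering cost spread across a user base, plus per-use cost."""
    return hardening_cost / users + marginal_cost

# A $50,000 hardening effort is prohibitive for a single user...
solo = per_user_cost(50_000, 0.01, 1)            # dominated by the fixed cost
# ...but negligible once a million users share the same vetted workflow.
shared = per_user_cost(50_000, 0.01, 1_000_000)  # pennies per user
```

The premise is that the fixed term shrinks toward zero with scale while the responsiveness users experience per invocation stays the same; the open question is whether real workflows are reused widely enough for the division to work.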

What would settle it

Observing whether agents using workflows from the proposed store demonstrate measurably lower failure rates or security incidents compared to on-the-fly agents in controlled high-stakes simulations or real deployments.
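Such a test could be sketched as a paired evaluation over an identical task suite; the agents below are stand-in stubs with assumed failure rates, not measurements:

```python
# Minimal sketch of the comparison that would settle the claim: run the same
# high-stakes task suite through both agent modes and compare failure rates.
# The agents, tasks, and rates are hypothetical stubs, not real systems.
import random

def evaluate(agent, tasks, trials_per_task=500, seed=0):
    """Return the empirical failure rate of an agent over repeated task runs."""
    rng = random.Random(seed)
    runs = [task for task in tasks for _ in range(trials_per_task)]
    failures = sum(agent(task, rng) for task in runs)  # True counts as a failure
    return failures / len(runs)

# Stub behaviors: the assumed rates encode the hypothesis under test, namely
# that a hardened workflow fails far less often than on-the-fly synthesis.
on_the_fly = lambda task, rng: rng.random() < 0.15   # assumed 15% failure rate
hardened   = lambda task, rng: rng.random() < 0.02   # assumed 2% failure rate

tasks = ["send_email", "move_money", "book_travel"]
```

Replacing the stubs with real agents and real high-stakes tasks, and observing whether `evaluate(hardened, tasks)` is measurably below `evaluate(on_the_fly, tasks)`, is the shape of the experiment the paper's claim invites.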

Figures

Figures reproduced from arXiv: 2605.10907 by Roxana Geambasu (Columbia University), Lillian Tsai (Google), Mariana Raykova (Google), Pierre Tholoniat (Google), Trishita Tiwari (Google), Wen Zhang (Google).

Figure 1
Figure 1. Problem: The Agentic AI code-and-execute loop short-circuits well-trodden SE processes that are the foundations of the relatively reliable and secure programs and services we enjoy today. These failures arise from the substantial ask of the “on-the-fly” loop: in seconds or minutes, and often for pennies, it must synthesize and execute multi-step plans: sending emails, moving money, booking travel, editing … view at source ↗
Figure 2
Figure 2. The AI Workflow Store architecture. view at source ↗
Figure 3
Figure 3. Positions our vision within the spectrum defined by the tension between flexibility (ability to respond to any user need with the right functionality) and robustness (reliability and security of that functionality). Traditional software sits at one extreme: highly robust through careful engineering, but expensive to produce and limited in scope and flexibility. Purely on-the-fly agents sit at the other e… view at source ↗
read the original abstract

The dominant paradigm for AI agents is an "on-the-fly" loop in which agents synthesize plans and execute actions within seconds or minutes in response to user prompts. We argue that this paradigm short-circuits disciplined software engineering (SE) processes -- iterative design, rigorous testing, adversarial evaluation, staged deployment, and more -- that have delivered the (relatively) reliable and secure systems we use today. By focusing on rapid, real-time synthesis, are AI agents effectively delivering users improvised prototypes rather than systems fit for high-stakes scenarios in which users may unwittingly apply them? This paper argues for the need to integrate rigorous SE processes into the agentic loop to produce production-grade, hardened, and deterministically-constrained agent *workflows* that substantially outperform the potentially brittle and vulnerable results of on-the-fly synthesis. Doing so may require extra compute and time, and if so, we must amortize the cost of rigor through reuse across a broad user community. We envision an *AI Workflow Store* that consists of hardened and reusable workflows that agents can invoke with far greater reliability and security than improvised tool chains. We outline the research challenges of this vision, which stem from a broader flexibility-robustness tension that we argue requires moving beyond the ``on-the-fly'' paradigm to navigate effectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

0 major / 0 minor

Summary. The paper argues that the prevailing on-the-fly paradigm for personal AI agents, characterized by rapid plan synthesis and action execution, short-circuits disciplined software engineering processes including iterative design, rigorous testing, adversarial evaluation, and staged deployment. Consequently, it questions whether such agents are delivering improvised prototypes rather than robust systems for high-stakes scenarios. The authors propose integrating SE rigor to create production-grade agent workflows and envision an AI Workflow Store for reusable, hardened workflows that agents can invoke, outlining associated research challenges arising from the flexibility-robustness tension.

Significance. Should the proposed approach prove viable, it would represent a significant advance in engineering reliable AI agents by adapting established software engineering methodologies to the agentic setting, potentially mitigating security and robustness issues. The manuscript deserves credit for grounding its vision in standard SE benefits and for framing the idea as an open research direction requiring further investigation into cost amortization and the flexibility-robustness trade-off, without overclaiming empirical support.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and insightful review, as well as their recommendation to accept the manuscript. Their summary accurately reflects our central argument that the dominant on-the-fly paradigm for personal AI agents circumvents established software engineering practices, and we appreciate the recognition that the work is framed as an open research direction without empirical overclaims.

Circularity Check

0 steps flagged

No significant circularity; position paper with independent argument

full rationale

The manuscript is a position paper advocating integration of software engineering processes into AI agent workflows via an AI Workflow Store to address flexibility-robustness tensions. It contains no equations, derivations, fitted parameters, or empirical predictions. The central claim is a high-level vision grounded in established SE principles (iterative design, testing, staged deployment) and does not reduce to self-citations, self-definitions, or renamed known results. No load-bearing steps are present that could exhibit circularity by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that traditional software engineering processes reliably produce more robust systems than rapid synthesis, and on the invented concept of a shared Workflow Store whose costs can be amortized.

axioms (1)
  • domain assumption Iterative design, rigorous testing, adversarial evaluation, and staged deployment produce more reliable and secure systems than on-the-fly synthesis.
    Invoked in the opening contrast between current agent loops and traditional SE success.
invented entities (1)
  • AI Workflow Store no independent evidence
    purpose: Repository of hardened, reusable, deterministically-constrained agent workflows that can be invoked instead of synthesized on the fly.
    Proposed as the central mechanism to amortize engineering costs and deliver robustness.

pith-pipeline@v0.9.0 · 5556 in / 1245 out tokens · 73061 ms · 2026-05-13T03:05:35.197055+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages · 8 internal anchors
