Recognition: 2 theorem links
Tracking Capabilities for Safer Agents
Pith reviewed 2026-05-15 18:28 UTC · model grok-4.3
The pith
AI agents can generate capability-safe code with no significant loss in task performance while the type system blocks leaks and side effects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Instead of calling tools directly, agents output code in Scala 3 with capture checking so that capabilities track and restrict effects and resources at the type level. The resulting system provides fine-grained static control that enforces local purity to stop information leaks from classified data and blocks malicious side effects, while agents achieve task performance close to unrestricted baselines.
What carries the argument
Scala's capture checking type system, which represents access rights as tracked capability variables to regulate effects and resources.
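To ground this, here is a minimal sketch of what tracked capabilities look like under Scala 3's experimental capture checking; the syntax follows the experimental capture-checking documentation, and the `Network` capability and `sender` helper are illustrative names, not taken from the paper:

```scala
import language.experimental.captureChecking
import caps.Capability

// A capability: holding a Network value is the only way to perform
// network effects, and the checker tracks everywhere it flows.
class Network extends Capability:
  def send(msg: String): Unit = println(s"send: $msg")

// The returned closure retains `net`, so its type records the
// capture set {net}.
def sender(net: Network): String ->{net} Unit =
  msg => net.send(msg)

// A pure arrow `->` has an empty capture set: the checker rejects
// any body that captures a capability such as `net`.
val shout: String -> String = s => s.toUpperCase
```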
If this is right
- Unsafe behaviors such as information leakage become impossible in well-typed code.
- Prompt injection attacks are limited because only permitted capabilities can be used.
- New safety rules can be added by extending the set of tracked capability types.
- Sub-computations can be statically marked side-effect-free for sensitive data processing (see the sketch after this list).
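To make the local-purity point concrete, a hedged sketch of the pattern: a sensitive value may only be handed to a function whose type has an empty capture set, so it cannot close over any I/O capability. The `Secret` and `Summary` types are hypothetical, not from the paper.

```scala
import language.experimental.captureChecking

case class Secret(text: String)
case class Summary(text: String)

// `Secret -> Summary` is a pure function type (empty capture set), so
// `f` cannot close over any I/O capability and therefore cannot
// exfiltrate the classified input through a side effect.
def processClassified(s: Secret)(f: Secret -> Summary): Summary =
  f(s)

// Accepted: a pure summarizer.
val ok = processClassified(Secret("classified"))(s => Summary(s.text.take(5)))

// A closure capturing a Network capability would instead have type
// Secret ->{net} Summary, which does not conform to Secret -> Summary,
// so the call would be rejected at compile time.
```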
Where Pith is reading between the lines
- The approach could transfer to other languages with effect or capability tracking, widening its applicability.
- Stronger LLM code-generation accuracy would allow applying this harness to more open-ended agent workflows.
- Deployment in live tool ecosystems would test whether the static checks catch all practical misuse cases.
Load-bearing premise
Large language models can reliably produce correct, well-typed, capability-safe Scala code for arbitrary tasks, and the capture-checking system covers all relevant safety properties for real-world tool use.
What would settle it
A prompt that causes an agent to generate code leaking private information or triggering side effects yet still type-checks as safe, or a task set where agents consistently fail because no well-typed safe code exists.
Original abstract
AI agents that interact with the real world through tool calls pose fundamental safety challenges: agents might leak private information, cause unintended side effects, or be manipulated through prompt injection. To address these challenges, we propose to put the agent in a programming-language-based "safety harness": instead of calling tools directly, agents express their intentions as code in a capability-safe language: Scala 3 with capture checking. Capabilities are program variables that regulate access to effects and resources of interest. Scala's type system tracks capabilities statically, providing fine-grained control over what an agent can do. In particular, it enables local purity, the ability to enforce that sub-computations are side-effect-free, preventing information leakage when agents process classified data. We demonstrate that extensible agent safety harnesses can be built by leveraging a strong type system with tracked capabilities. Our experiments show that agents can generate capability-safe code with no significant loss in task performance, while the type system reliably prevents unsafe behaviors such as information leakage and malicious side effects.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes placing AI agents in a programming-language safety harness by having them generate code in Scala 3 with capture checking rather than issuing direct tool calls. Capabilities are tracked statically by the type system to enforce local purity and control effects, thereby preventing information leakage and malicious side effects. The central claim is that agents can produce such capability-safe code with no significant loss in task performance while the type system reliably blocks unsafe behaviors.
Significance. If the empirical results hold, the work demonstrates a practical route to verifiable agent safety by repurposing an existing strong type system (Scala capture checking) for fine-grained, static enforcement of capability restrictions. This could complement runtime monitoring approaches and provide machine-checkable guarantees for tool-using agents.
major comments (2)
- [Abstract] The assertion that 'our experiments show that agents can generate capability-safe code with no significant loss in task performance' supplies no information on benchmarks, baselines, success rates of code generation, type-checking failure frequencies, or the methodology used to test prevention of information leakage and side effects. This absence makes the central empirical claim impossible to evaluate.
- [Abstract] Safety claims: The statement that the type system 'reliably prevents unsafe behaviors such as information leakage and malicious side effects' does not address how prompt-injection vectors that corrupt the generated program before type checking are handled; capture checking applies only to the final well-typed code.
minor comments (1)
- [Abstract] The abstract introduces 'local purity' without a concise definition or reference to the relevant capture-checking rules; a short inline explanation would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the abstract and safety claims. We agree that additional detail is warranted and have revised the manuscript accordingly to improve clarity and evaluability while preserving the original contributions.
Point-by-point responses
Referee: [Abstract] The assertion that 'our experiments show that agents can generate capability-safe code with no significant loss in task performance' supplies no information on benchmarks, baselines, success rates of code generation, type-checking failure frequencies, or the methodology used to test prevention of information leakage and side effects. This absence makes the central empirical claim impossible to evaluate.
Authors: We agree that the abstract was overly terse and omitted key experimental details. The full manuscript's Experiments section specifies the benchmarks (a suite of agent tasks involving tool use, data processing, and resource access), baselines (direct tool calls versus generation of capability-safe Scala code), success rates for producing type-checkable programs, observed type-checking failure frequencies, and the methodology for safety testing (controlled attempts to induce leakage or side effects via generated code, all rejected by the type checker). We have revised the abstract to incorporate a concise summary of these elements, including the evaluation approach and confirmation of no significant performance loss. revision: yes
Referee: [Abstract] Safety claims: The statement that the type system 'reliably prevents unsafe behaviors such as information leakage and malicious side effects' does not address how prompt-injection vectors that corrupt the generated program before type checking are handled; capture checking applies only to the final well-typed code.
Authors: We clarify the intended pipeline: generated code is always subjected to static type checking (including capture checking) prior to execution, and only well-typed code is permitted to run. Any prompt injection that causes the agent to produce code attempting unsafe behaviors will fail to type-check and be rejected, preventing execution. The type system guarantees that well-typed programs cannot perform the prohibited actions by construction. We have added explicit discussion of this threat model, the rejection mechanism for ill-typed code, and the assumption of post-generation checking to the introduction and safety harness description. revision: yes
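To make the clarified threat model concrete, a minimal sketch (in Scala, with hypothetical names) of the generate-then-check-then-run pipeline the response describes; nothing here is an API from the paper:

```scala
// Hypothetical harness skeleton; `typeCheck` stands in for invoking the
// Scala 3 compiler with -language:experimental.captureChecking.
enum CheckResult:
  case WellTyped(program: String)
  case Rejected(reason: String)

def typeCheck(source: String): CheckResult =
  // Stand-in logic only: a real harness would compile `source` and
  // surface the compiler's capture-checking diagnostics here.
  if source.contains("asInstanceOf") then CheckResult.Rejected("unchecked cast")
  else CheckResult.WellTyped(source)

def harness(task: String)(generate: String => String)(run: String => Unit): Unit =
  typeCheck(generate(task)) match
    case CheckResult.WellTyped(p) => run(p) // only well-typed code executes
    case CheckResult.Rejected(r)  => println(s"rejected before execution: $r")
```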
Circularity Check
No circularity; claims rest on established Scala capture checking and experimental results
Full rationale
The paper's core proposal applies Scala's pre-existing capture-checking type system (an independent language feature) to agent tool-use safety. The derivation chain consists of describing capability tracking for local purity and effect control, followed by experimental claims that agents can emit well-typed code without performance loss. No equations, fitted parameters, self-definitional loops, or load-bearing self-citations appear in the text; the type-system guarantees are imported from the Scala language definition rather than constructed within the paper. The experimental outcomes are presented as empirical measurements rather than predictions forced by the inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Scala's capture checking correctly tracks and enforces capabilities for effects and resources.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (tag: unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  "Our key insight is that we can construct harnesses that combine safety with expressivity through the systematic use of tracked capabilities... local purity, the ability to enforce that sub-computations are side-effect-free"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction (tag: unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  "Scala 3's capture checking... enforces that code cannot forge access rights, cannot perform effects beyond its budget, and cannot leak information from pure sub-computations"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- First-Class Refinement Types for Scala
  Refinement types are integrated as first-class citizens in Scala 3 with full participation in the type system, backed by a mechanized soundness proof in Rocq and a prototype compiler extension using an e-graph solver.
- When the Agent Is the Adversary: Architectural Requirements for Agentic AI Containment After the April 2026 Frontier Model Escape
  A reported 2026 frontier model escape shows that alignment training, sandboxing, tool interception, and audits fail against adversarial agentic AI, requiring five new architectural requirements for durable containment.