The Authorization-Execution Gap Is a Major Safety and Security Problem in Open-World Agents
Pith reviewed 2026-05-13 01:16 UTC · model grok-4.3
The pith
Open-world agents create an authorization-execution gap where intended permissions diverge from executed actions, producing hard-to-undo harm.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the Authorization-Execution Gap (AEG) is a major safety and security problem in open-world agents. The AEG is the divergence between what a principal intends to authorize and what the agent ultimately executes. This divergence arises dynamically from three structural sources: delegation-level incompleteness, channel-level corruption, and composition-level fragmentation. The same observed failure may stem from any source, so defenses must diagnose the source during execution and apply runtime authorization integrity checks rather than relying on one-time upfront filters or post-hoc audits.
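To make the runtime-check requirement concrete, here is a minimal sketch of what an execution-time authorization integrity check with source attribution could look like. Everything in it (the Action record, the AEGSource enum, the diagnosis heuristics) is an illustrative assumption; the paper states the requirement, not an API.

```python
# Minimal sketch, not the paper's design: a runtime check that blocks a
# divergent action only after attributing it to one of the three sources.
from dataclasses import dataclass
from enum import Enum, auto


class AEGSource(Enum):
    """The three structural sources of the Authorization-Execution Gap."""
    DELEGATION_INCOMPLETENESS = auto()  # intent under-specified at delegation time
    CHANNEL_CORRUPTION = auto()         # e.g. injected instructions in tool output
    COMPOSITION_FRAGMENTATION = auto()  # authorization lost across agent handoffs


@dataclass
class Action:
    tool: str            # the tool the agent is about to invoke
    provenance: str      # which channel or agent produced this request
    delegated_by: str    # the principal or agent that authorized this step


def check_authorization_integrity(action: Action, scope: set[str]) -> AEGSource | None:
    """Return None if the action is inside the authorized scope; otherwise
    return the diagnosed structural source. The heuristics are placeholders."""
    if action.tool in scope:
        return None
    if action.provenance.startswith("untrusted:"):
        return AEGSource.CHANNEL_CORRUPTION         # request entered via a tainted channel
    if action.delegated_by != "principal":
        return AEGSource.COMPOSITION_FRAGMENTATION  # scope lost in an agent handoff
    return AEGSource.DELEGATION_INCOMPLETENESS      # original mandate never covered this
```

The detail the sketch carries over from the paper is ordering: attribution happens at the moment of divergence, during execution, rather than in a post-hoc audit.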
What carries the argument
The Authorization-Execution Gap, the divergence between intended authorization and actual execution, is carried by three structural sources: delegation-level incompleteness, channel-level corruption, and composition-level fragmentation.
If this is right
- Defenses must identify the structural source of any authorization divergence rather than treating symptoms alone.
- Authorization integrity must be checked continuously during execution because the sources arise dynamically.
- Papers on open-world agents should report process-level evidence of where AEG was detected, constrained, and attributed to a source (one possible record format is sketched after this list).
- The same failure can arise from any of the three sources, making source-agnostic metrics insufficient for security evaluation.
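One way the reporting point above could be operationalized: a per-step evidence record that logs where AEG was detected, whether it was constrained, and which source it was attributed to. The field names below are hypothetical, not the paper's.

```python
# Hypothetical shape for the process-level evidence a paper might report.
from dataclasses import dataclass


@dataclass
class AEGEvidenceRecord:
    step: int          # position in the trajectory where divergence was detected
    action: str        # the tool call or operation that diverged
    source: str        # attributed structural source
    constrained: bool  # whether the runtime check blocked or narrowed the action
    detail: str = ""   # free-form diagnostic context


# Example record a benchmark harness might emit for one flagged tool call:
record = AEGEvidenceRecord(
    step=17,
    action="email.send",
    source="channel-level corruption",
    constrained=True,
    detail="instruction originated in untrusted web page content",
)
```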
Where Pith is reading between the lines
- Existing agent tool-use frameworks could be audited level-by-level to measure how often delegation incompleteness occurs in practice.
- Multi-agent handoff protocols may need explicit authorization tokens passed between agents to reduce composition-level fragmentation (a minimal token sketch follows this list).
- Runtime checks could be tested by injecting controlled corruption into channels and measuring whether source diagnosis improves recovery rates.
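A minimal sketch of the token idea in the second point above, assuming an HMAC-signed scope token that the receiving agent verifies before acting. The token format and key handling are assumptions, not a mechanism from the paper.

```python
# Sketch: an explicit authorization token carried across a multi-agent
# handoff, so delegated scope is verified rather than inherited implicitly.
import hashlib
import hmac
import json

SECRET = b"shared-orchestrator-key"  # placeholder; real systems need key management


def mint_token(principal: str, scope: list[str]) -> dict:
    """Orchestrator side: bind a principal's delegated scope into a signed token."""
    payload = json.dumps({"principal": principal, "scope": sorted(scope)})
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


def verify_and_check(token: dict, requested_tool: str) -> bool:
    """Receiving agent side: verify integrity, then check the tool is in scope."""
    expected = hmac.new(SECRET, token["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # token forged or corrupted in transit
    return requested_tool in json.loads(token["payload"])["scope"]


token = mint_token("alice", ["calendar.read", "email.draft"])
assert verify_and_check(token, "email.draft")      # inside delegated scope
assert not verify_and_check(token, "email.send")   # outside delegated scope
```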
Load-bearing premise
That most agent failures can be traced to these three structural sources and that source-oriented runtime integrity checks can be implemented without blocking useful agent behavior.
What would settle it
An open-world agent failure that cannot be attributed to delegation incompleteness, channel corruption, or composition fragmentation, or a working agent system that maintains safety using only upfront filters and post-hoc audits with no runtime checks.
Original abstract
This position paper argues that the Authorization-Execution Gap (AEG) is a major safety and security problem in open-world agents. The AEG is the divergence between what a principal intends to authorize and what an open-world agent ultimately executes. Because such agents act autonomously across tools, persistent state, and multi-agent handoffs, even small instances of authorization divergence can cause harm that is difficult or impossible to undo. We argue that many observed agent failures can be traced to three structural sources of AEG: delegation-level incompleteness, channel-level corruption, and composition-level fragmentation. The same observed failure may arise from any of these sources. Without identifying the source, a defense targeting the symptom alone cannot address the underlying cause. Agent safety and security should therefore emphasize source-oriented diagnosis and defense. Because the structural sources of AEG arise dynamically during execution, this approach necessarily requires authorization integrity checks applied during execution, rather than relying solely on one-shot upfront filtering or post-hoc audit. For NeurIPS, the implication is that papers on open-world agents should report not only outcome-level metrics such as task success or attack resistance, but also process-level evidence showing where AEG was detected, constrained, and attributed to a structural source during execution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This position paper introduces the Authorization-Execution Gap (AEG) as the divergence between a principal's intended authorization and what an open-world agent actually executes. It argues that AEG is a major safety and security problem because even small divergences can cause irreversible harm in autonomous, tool-using, multi-agent settings. The paper traces many observed failures to three structural sources—delegation-level incompleteness, channel-level corruption, and composition-level fragmentation—and concludes that defenses must be source-oriented and applied at runtime rather than relying solely on upfront filtering or post-hoc audits. It recommends that NeurIPS papers on open-world agents report process-level evidence of AEG detection, constraint, and source attribution.
Significance. If the framework holds, it offers a coherent conceptual lens for analyzing why current authorization mechanisms fall short in dynamic agent environments and could usefully redirect research toward runtime integrity diagnostics. The position correctly notes that symptom-focused defenses may miss root causes and that process-level reporting would complement existing outcome metrics such as task success or attack resistance. As a purely conceptual contribution without empirical cases, formal models, or implementation details, its significance lies in framing future work rather than in immediate technical advance.
Major comments (2)
- [Abstract / structural sources section] The central claim that 'many observed agent failures can be traced to' delegation-level incompleteness, channel-level corruption, and composition-level fragmentation is load-bearing for the argument that source-oriented runtime checks are required, yet the manuscript supplies no case studies, failure traces, or references to concrete incidents to substantiate the tracing or to show that the three sources are comprehensive.
- [Runtime checks / defense implications section] The assertion that dynamic sources of AEG 'necessarily require' execution-time checks (rather than static or post-hoc methods) follows from the stated premises, but the paper does not discuss implementation feasibility, performance cost, or how such checks could be realized without unduly constraining useful agent behavior; this directly affects the practicality of the recommended defense strategy.
Minor comments (2)
- The three sources are introduced at a high level; brief illustrative examples or a diagram showing how each source produces AEG in a concrete agent workflow would improve clarity without altering the conceptual argument.
- The NeurIPS recommendation paragraph could be expanded with one or two concrete examples of process-level metrics (e.g., 'fraction of tool calls where delegation incompleteness was detected at runtime'), as sketched below, to make the reporting suggestion more actionable.
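A sketch of one such metric, assuming a per-step trace in which every tool call records which structural source, if any, the runtime check attributed; the record format is hypothetical.

```python
# Hypothetical metric: fraction of tool calls where delegation-level
# incompleteness was detected at runtime.

def delegation_incompleteness_rate(trace: list[dict]) -> float:
    """trace: one dict per step; tool calls carry a 'detected_source' key
    that is None or one of the three structural-source labels."""
    tool_calls = [step for step in trace if step.get("kind") == "tool_call"]
    if not tool_calls:
        return 0.0
    flagged = sum(
        1 for step in tool_calls
        if step.get("detected_source") == "delegation-level incompleteness"
    )
    return flagged / len(tool_calls)


toy_trace = [
    {"kind": "tool_call", "detected_source": None},
    {"kind": "tool_call", "detected_source": "delegation-level incompleteness"},
    {"kind": "message"},
]
print(delegation_incompleteness_rate(toy_trace))  # 0.5
```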
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The comments correctly identify areas where additional grounding and practicality considerations would strengthen the position paper. We respond to each major comment below and indicate the revisions we will make.
Point-by-point responses
Referee: [Abstract / structural sources section] The central claim that 'many observed agent failures can be traced to' delegation-level incompleteness, channel-level corruption, and composition-level fragmentation is load-bearing for the argument that source-oriented runtime checks are required, yet the manuscript supplies no case studies, failure traces, or references to concrete incidents to substantiate the tracing or to show that the three sources are comprehensive.
Authors: We acknowledge that the manuscript does not currently include explicit case studies or failure traces. As a position paper, the three sources are proposed as a conceptual categorization derived from patterns across the existing agent safety and security literature rather than from new empirical analysis. In the revised version we will add targeted references to documented incidents and prior work illustrating each source (e.g., incomplete delegation in tool-use failures, channel corruption via prompt or state injection, and fragmentation in multi-agent handoffs). This will substantiate the tracing claim while preserving the paper's focus on framing rather than exhaustive validation. We will also note that the categorization is offered as a starting point open to refinement. revision: yes
Referee: [Runtime checks / defense implications section] The assertion that dynamic sources of AEG 'necessarily require' execution-time checks (rather than static or post-hoc methods) follows from the stated premises, but the paper does not discuss implementation feasibility, performance cost, or how such checks could be realized without unduly constraining useful agent behavior; this directly affects the practicality of the recommended defense strategy.
Authors: We agree that feasibility considerations are important for the recommendation to be useful. The argument for runtime checks follows directly from the dynamic character of the three structural sources, which cannot be fully resolved by static analysis or post-hoc review alone. In the revision we will expand the defense section with a high-level discussion of implementation directions, including lightweight runtime monitors, selective checking based on risk level, and mechanisms for source attribution. We will also address potential performance trade-offs and the need to avoid over-constraining agent autonomy. Detailed designs, cost measurements, and evaluations remain outside the scope of this position paper and are left for future technical work. revision: partial
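To give the feasibility direction above one concrete shape: a selective, risk-based monitor that reserves the potentially expensive integrity check for actions that are high-risk or hard to undo. The risk tiers and the reversibility flag are assumptions for illustration, not designs from the paper.

```python
# Sketch of selective checking based on risk level; all tiers are assumed.
HIGH_RISK_TOOLS = {"email.send", "payment.transfer", "file.delete"}


def should_check(tool: str, reversible: bool) -> bool:
    """Invoke the (potentially costly) integrity check only when the action
    is high-risk or irreversible, keeping overhead off the common path."""
    return tool in HIGH_RISK_TOOLS or not reversible


def execute(tool: str, reversible: bool, scope: set[str]) -> str:
    if should_check(tool, reversible) and tool not in scope:
        return f"blocked: {tool} outside authorized scope"
    return f"executed: {tool}"


print(execute("calendar.read", reversible=True, scope={"calendar.read"}))  # executed
print(execute("email.send", reversible=False, scope={"calendar.read"}))    # blocked
```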
Circularity Check
No significant circularity
Full rationale
The paper is a position paper that advances a conceptual framework defining the Authorization-Execution Gap and tracing failures to three structural sources through logical argument. No equations, derivations, fitted parameters, or self-citations appear in the provided text. The central claims follow directly from stated premises without reducing to self-referential definitions or prior results by construction, making the reasoning self-contained.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Open-world agents act autonomously across tools, persistent state, and multi-agent handoffs.
Invented entities (4)
- Authorization-Execution Gap (AEG): no independent evidence
- delegation-level incompleteness: no independent evidence
- channel-level corruption: no independent evidence
- composition-level fragmentation: no independent evidence