A pragmatic approach to regulating AI agents
Pith reviewed 2026-05-10 09:57 UTC · model grok-4.3
The pith
AI agents require regulation as distinct AI systems under the EU AI Act due to their autonomous cross-system actions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The unique capacity of agents to autonomously reason, plan, and execute tasks across disparate external systems necessitates a fundamental shift in oversight toward the orchestration layer, where multi-agent interactions introduce novel risks of misalignment. Although agents generally utilise general-purpose AI models, their structural complexity and cross-system permeability require them to be regulated as AI systems with distinct obligations under the AI Act. On the contract-law side, a "traffic light" system of staggered task authorization based on operational risk should be paired with a statutory list of non-delegable legal acts.
What carries the argument
The orchestration layer as the primary site for overseeing multi-agent interactions, paired with a risk-tiered traffic light system for task authorization and a statutory list of non-delegable legal acts.
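The paper proposes the traffic-light scheme and the statutory list as legal instruments, not as software; still, the mechanism they describe is a risk-tiered authorization gate. A minimal hypothetical sketch follows. Everything here (the `Tier` enum, the `NON_DELEGABLE` set, the `authorize` function, and the example task names) is illustrative and invented for this sketch; the paper enumerates none of these.

```python
from enum import Enum

class Tier(Enum):
    GREEN = "green"  # low operational risk: agent may act autonomously
    AMBER = "amber"  # medium risk: agent acts, but the principal is notified
    RED = "red"      # high risk: explicit human approval required

# Hypothetical stand-in for the statutory list of non-delegable legal acts.
NON_DELEGABLE = {"execute_will", "marry", "waive_fundamental_rights"}

def authorize(task: str, tier: Tier, human_approved: bool = False) -> bool:
    """Gate an agent task under a staggered, risk-tiered authorization scheme."""
    if task in NON_DELEGABLE:
        return False  # reserved to humans regardless of risk tier
    if tier is Tier.GREEN:
        return True
    if tier is Tier.AMBER:
        print(f"notify principal: agent performing '{task}'")  # audit trail
        return True
    return human_approved  # RED: only with explicit human sign-off
```

The design point the sketch makes concrete: non-delegable acts are checked before any tier logic, so no amount of delegated authority can reach them, which mirrors the paper's aim of anchoring agent autonomy in human accountability.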
Where Pith is reading between the lines
- Regulators outside the EU might copy the orchestration-layer focus when writing rules for agentic systems.
- Developers could add monitoring hooks at the point where agents coordinate to simplify compliance.
- Controlled trials of multi-agent deployments could check whether orchestration-level rules actually lower misalignment events.
- The approach might lead to standard contract templates that flag which actions remain human-only.
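The second bullet's monitoring hooks at the coordination point can be pictured as a wrapper around whatever handoff function an orchestrator uses to route tasks between agents. This is a hypothetical sketch, not an API from the paper or any real framework; `with_orchestration_log`, the event schema, and the agent names are all invented for illustration.

```python
import time
from typing import Any, Callable, Dict, List

def with_orchestration_log(
    handoff: Callable[[str, str, Dict[str, Any]], Any],
    log: List[Dict[str, Any]],
) -> Callable[[str, str, Dict[str, Any]], Any]:
    """Wrap an agent-to-agent handoff so every coordination event is
    recorded before it executes, yielding an orchestration-layer audit trace."""
    def logged(sender: str, receiver: str, payload: Dict[str, Any]) -> Any:
        log.append({
            "ts": time.time(),
            "sender": sender,
            "receiver": receiver,
            "payload": payload,
        })
        return handoff(sender, receiver, payload)
    return logged

# Usage: wrap the routing function once; every handoff is then observable.
events: List[Dict[str, Any]] = []
send = with_orchestration_log(
    lambda s, r, p: f"{r} accepted {p['task']}", events
)
result = send("planner", "executor", {"task": "draft_contract"})
```

Because the hook sits at the single point where agents coordinate, compliance tooling can consume one event stream instead of instrumenting each agent separately.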
Load-bearing premise
The premise that agents built on general-purpose models still create unique risks through their structural complexity and cross-system operations that cannot be managed by regulating the models alone, and that existing contract law can absorb a statutory list of non-delegable acts without major revision.
What would settle it
A documented case in which agents operating across multiple systems under only general-purpose model rules produce no misalignment, or a judicial ruling that EU contract law cannot accommodate a statutory list of non-delegable acts.
Original abstract
The current advancement in and deployment of agentic AI systems has created a set of key challenges for the legal frameworks that govern their use. We cover two central components: first, the regulatory classification of agents under the EU AI Act, and second, the legal status and validity of autonomous actions within the established framework of EU contract law. We argue that the unique capacity of agents to autonomously reason, plan, and execute tasks across disparate external systems necessitates a fundamental shift in oversight toward the orchestration layer, where multi-agent interactions introduce novel risks of misalignment. While agents generally utilise general-purpose AI models, we posit that their structural complexity and cross-system permeability require them to be regulated as "AI systems" with distinct obligations under the AI Act. Consequently, our proposals highlight the need for robust accountability mechanisms to manage this heightened autonomy. On the contractual side, we advocate for a "traffic light" system of staggered task authorization based on operational risk and the creation of a statutory list of non-delegable legal acts. By implementing these measures, we provide a pragmatic pathway to ensure that the increasing autonomy of AI agents remains firmly anchored in human accountability and existing legal standards.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that AI agents' autonomous reasoning, planning, and cross-system execution create novel risks at the orchestration layer of multi-agent systems, requiring them to be classified and regulated as distinct 'AI systems' under the EU AI Act with tailored obligations, even when built on general-purpose models. It further proposes, within EU contract law, a traffic-light system of staggered task authorization keyed to operational risk together with a statutory list of non-delegable legal acts to preserve human accountability.
Significance. If the classification and compatibility arguments are substantiated, the manuscript offers a concrete policy bridge between technical agent architectures and the EU AI Act plus contract-law instruments, potentially informing how regulators allocate obligations between model providers and agent deployers.
major comments (2)
- [Regulatory classification section (around the paragraph beginning 'While agents generally utilise general-purpose AI...')] The central claim that agents' 'structural complexity and cross-system permeability' necessitate distinct 'AI system' obligations beyond GPAI rules is load-bearing yet unsupported by explicit mapping. The manuscript should cite the precise definitions in Article 3 and the obligations in Article 13 and Chapter V of the AI Act and demonstrate why orchestration-layer interactions fall outside those provisions.
- [Contract-law proposals section (the paragraphs advocating the 'traffic light' system and statutory list)] The proposal for a statutory list of non-delegable acts and a traffic-light authorization scheme assumes compatibility with existing principles of contractual autonomy and electronic agency under the e-Commerce Directive and national laws, but provides no analysis of potential conflicts with liability allocation or representation rules; this gap directly affects the feasibility of the contract-law recommendations.
minor comments (2)
- [Abstract and introduction] The abstract and introduction would benefit from a short table or bullet list contrasting the proposed obligations with current GPAI transparency and risk-management requirements to improve readability.
- [Throughout] Several sentences contain repetitive phrasing (e.g., repeated use of 'autonomously reason, plan, and execute'); tightening would strengthen the legal argumentation.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and address each major point below, indicating the revisions we will incorporate.
Point-by-point responses
- Referee: The central claim that agents' 'structural complexity and cross-system permeability' necessitate distinct 'AI system' obligations beyond GPAI rules is load-bearing yet unsupported by explicit mapping. The manuscript should cite the precise definitions in Article 3 and the obligations in Article 13 and Chapter V of the AI Act and demonstrate why orchestration-layer interactions fall outside those provisions.
  Authors: We accept that an explicit mapping to the AI Act provisions would strengthen the argument. In the revised manuscript we will add a dedicated subsection quoting Article 3(1) (definition of 'AI system') and Article 3(63) (definition of 'general-purpose AI model'), then map the orchestration layer's autonomous planning and cross-system execution to the obligations in Article 13 and Chapter V. We will show that these features generate emergent misalignment risks and systemic interactions that fall outside the model-provider-focused transparency and risk-management duties applicable to GPAI, thereby justifying distinct classification and obligations at the orchestration level.
  Revision: yes
- Referee: The proposal for a statutory list of non-delegable acts and a traffic-light authorization scheme assumes compatibility with existing principles of contractual autonomy and electronic agency under the e-Commerce Directive and national laws, but provides no analysis of potential conflicts with liability allocation or representation rules; this gap directly affects the feasibility of the contract-law recommendations.
  Authors: We agree that compatibility analysis is required. The revised version will include a new paragraph examining the e-Commerce Directive (particularly electronic contracting and agency provisions) and national representation rules. We will argue that the traffic-light scheme preserves contractual autonomy by conditioning high-risk authorizations on human approval, thereby maintaining existing liability allocation to the human principal rather than shifting it to the agent. The statutory list of non-delegable acts will be presented as consistent with capacity and ratification requirements, avoiding conflicts by treating agent outputs as conditionally authorized representations subject to human oversight.
  Revision: yes
Circularity Check
No circularity: policy arguments grounded in external EU legal frameworks
Full rationale
The paper is a normative legal analysis proposing regulatory classifications and contractual mechanisms under the EU AI Act and contract law. Its claims rest on interpretive application of external statutes (e.g., the definitions of AI systems, the obligations in Articles 3 and 13 and Chapter V, and the e-Commerce Directive) rather than on self-referential equations, fitted parameters renamed as predictions, or self-citation chains that reduce the central thesis to its own inputs. No load-bearing steps exhibit self-definition, ansatz smuggling, or renaming of known results; the argument is checked against independent legal benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption EU AI Act framework applies to agentic systems and can be extended with distinct obligations for orchestration layers
- domain assumption Existing EU contract law can incorporate a traffic light authorization system and statutory list of non-delegable acts