pith. machine review for the scientific record.

arxiv: 2602.17753 · v2 · submitted 2026-02-19 · 💻 cs.CY · cs.AI

The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems

Pith reviewed 2026-05-15 20:39 UTC · model grok-4.3

classification 💻 cs.CY cs.AI
keywords AI agents · transparency · safety features · agentic AI · documentation · AI ecosystem · societal impacts · evaluations

The pith

The 2025 AI Agent Index catalogs origins, capabilities, and safety details for 30 leading AI agents while showing uneven developer transparency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates the 2025 AI Agent Index to record the origins, design, capabilities, ecosystem, and safety features of 30 state-of-the-art AI agents. It draws from publicly available sources and voluntary email responses from developers to build this record. A reader would care because agentic AI systems are advancing quickly with limited oversight, making systematic documentation useful for researchers and policymakers who need to track capabilities and risks. The index further shows that transparency levels differ across developers, with most providing little information on safety evaluations and societal impacts.

Core claim

The 2025 AI Agent Index compiles information on the origins, design, capabilities, ecosystem, and safety features of 30 state-of-the-art AI agents from publicly available information and email correspondence with developers. It reveals different transparency levels among agent developers and finds that most share little information about safety, evaluations, and societal impacts.

What carries the argument

The AI Agent Index, a structured documentation compiled from public data and voluntary developer responses that tracks technical features and safety posture across agents.

If this is right

  • Transparency on safety and evaluations varies widely among the 30 agents.
  • Most developers release minimal details about societal impacts and risk assessments.
  • The index provides a baseline reference for observing trends in agent capabilities and developer practices.
  • Policymakers gain a consolidated view of the current agent ecosystem from public sources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The index could serve as a starting point for standardized disclosure requirements in future AI governance.
  • Developers may face pressure to increase transparency if the index becomes a widely used reference.
  • Gaps in the public record suggest value in supplementary verification methods beyond voluntary responses.

Load-bearing premise

Publicly available information together with voluntary developer responses gives an accurate and representative picture of each agent's actual capabilities and safety posture without major undisclosed details.

What would settle it

An independent audit or technical inspection of any one indexed agent that reveals substantial undisclosed safety vulnerabilities or capabilities would show the index's documentation to be incomplete.

Figures

Figures reproduced from arXiv: 2602.17753 by A. Pinar Ozisik, Kevin Feng, Kevin Wei, Leon Staufer, Luke Bailey, Mick Yang, Noam Kolt, Stephen Casper, Yawen Duan.

Figure 1: Interest in AI agents is growing. 2025 has seen a sharp increase in interest in AI agents. This is reflected in an increase of new Google search terms related to agentic AI products (blue bars) as well as Google Scholar paper counts for “AI agent” or “agentic AI” (red line). Accumulation of individual releases of agentic AI products included in this Index is shown by category: chats with agentic tools, ent…
Figure 2: Inclusion criteria for the Index. Candidate agents flow through three criteria categories from left to right. Systems…
Figure 3: For 227 out of 1,350 fields, we were unable to find any information (gray). This is most common in the “Ecosystem…
Figure 4: Comparison between Chinese, US, and other agent developers. To mitigate potential blind spots, two native Chinese…
Figure 5: Distribution of levels of autonomy across each agent category, with the autonomy of three representative agents…
Figure 6: Most safety, evaluation, and social impact related fields (135/240) have no information available. Enterprise agents…
Figure 7: Control of parts of the AI agent ecosystem is fragmented, making reliable agent evaluation difficult. Individual…
Figure 8: The 2025 AI Agent Index with detailed annotations across 6 categories (45 columns) for 30 agentic AI products.
Figure 9: First release of indexed agentic AI products over time and by agent category (…
Figure 10: For 227 out of 1,350 fields, we were unable to find any information (gray). This is most common in the “Ecosystem…
Figure 11: Number of new AI agentic product releases by month.
Figure 12: Number of information fields with “None found” by field category.
Figure 13: Support for choosing models and MCP varies by category of the agent. Enterprise agents are more likely to support…
Figure 14: Changes in annotation fields across categories since the 2024 Index. We make significant changes across all sections compared to the 2024 Index [22]: 16 fields in the 2025 version are completely new, and only 9 fields are kept unaltered (Keep). The remaining 16 fields were derived from fields in the 2024 version either by making significant modifications to the definition/notes (Modify), splitting a field…
Figure 15: Custom annotation viewer to compare the human annotations to the LLM-generated ones in the verification phase.
Figure 16: Global search volume (log scale) for each AI agent product. Blue indicates inclusion in the Index.
Figure 17: Monthly search volume (log scale) for the most popular term for each AI agent product included in the Index over…
Figure 18: Monthly search volume (log scale) for the 10 most popular AI agent products, colored by agent category (…
Figure 19: GitHub stars for repositories associated with the agent products.
Figure 20: Details on the safety features and red teaming methodology for AI agents are limited. Screenshot of the “Model…
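
The counts quoted in the captions can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes that the 1,350 fields in Figure 3 are the 30 agents times the 45 columns of Figure 8, and that the 240 safety-related fields of Figure 6 are a subset of those 1,350. Neither assumption is stated in the captions, and the variable names are hypothetical.

```python
from fractions import Fraction

# Counts quoted in the figure captions.
AGENTS = 30            # Figure 8
COLUMNS = 45           # Figure 8
MISSING_ALL = 227      # Figures 3 and 10: fields with no information found
SAFETY_TOTAL = 240     # Figure 6: safety, evaluation, and social impact fields
SAFETY_MISSING = 135   # Figure 6: of those, fields with no information

# Assumption: every agent is annotated on every column.
total_fields = AGENTS * COLUMNS

overall_missing = Fraction(MISSING_ALL, total_fields)
safety_missing = Fraction(SAFETY_MISSING, SAFETY_TOTAL)

# Assumption: the safety-related fields are a subset of all fields,
# so the remainder gives the missing rate for everything else.
other_missing = Fraction(MISSING_ALL - SAFETY_MISSING, total_fields - SAFETY_TOTAL)

print(f"total fields:           {total_fields}")
print(f"missing overall:        {float(overall_missing):.1%}")
print(f"missing, safety fields: {float(safety_missing):.1%}")
print(f"missing, other fields:  {float(other_missing):.1%}")
```

Under these assumptions the safety-related fields account for well over half of all empty fields, which is consistent with the paper's claim that developers share little about safety, evaluations, and societal impacts.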

Original abstract

Agentic AI systems are increasingly capable of performing professional and personal tasks with limited human involvement. However, tracking these developments is difficult because the AI agent ecosystem is complex, rapidly evolving, and inconsistently documented, posing obstacles to both researchers and policymakers. To address these challenges, this paper presents the 2025 AI Agent Index. The Index documents information regarding the origins, design, capabilities, ecosystem, and safety features of 30 state-of-the-art AI agents based on publicly available information and email correspondence with developers. In addition to documenting information about individual agents, the Index illuminates broader trends in the development of agents, their capabilities, and the level of transparency of developers. Notably, we find different transparency levels among agent developers and observe that most developers share little information about safety, evaluations, and societal impacts. The 2025 AI Agent Index is available online at https://aiagentindex.mit.edu

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper presents the 2025 AI Agent Index, which documents the origins, design, capabilities, ecosystem, and safety features of 30 state-of-the-art AI agents using publicly available information supplemented by email correspondence with developers. It additionally reports broader trends, including varying transparency levels across developers and limited disclosure on safety evaluations and societal impacts, with the full Index hosted online at https://aiagentindex.mit.edu.

Significance. If the collected data accurately reflect available public and volunteered information, the Index supplies a timely baseline resource for researchers and policymakers tracking the AI agent ecosystem. The public online release and the descriptive observations on transparency gaps constitute concrete contributions that can inform future work on documentation standards and evaluation practices.

major comments (1)
  1. [Data Collection] Data Collection section: the manuscript provides insufficient detail on the number of developers contacted, response rates to correspondence, verification procedures for public claims, and the handling of incomplete or unverified entries. These omissions directly affect the reliability of the central observational claims about transparency levels and safety-feature coverage.
minor comments (2)
  1. [Abstract] Abstract: a single sentence summarizing the data-gathering approach (public sources plus voluntary correspondence) would better frame the reported transparency findings for readers.
  2. [Online Index] Index presentation: the online resource should explicitly tag each data field with its source type (public document vs. developer reply) to allow users to assess completeness independently.
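
One way to picture the source-type tagging suggested in the second minor comment is a small record type that carries its provenance. This is a minimal sketch, not the Index's actual schema: `SourceType`, `IndexField`, `count_by_source`, and the sample rows are all hypothetical names and made-up data.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class SourceType(Enum):
    """Where a field's value came from (labels are assumptions)."""
    PUBLIC_DOCUMENT = "public document"
    DEVELOPER_REPLY = "developer reply"
    NONE_FOUND = "none found"


@dataclass(frozen=True)
class IndexField:
    """One annotated cell of the index, tagged with its source type."""
    agent: str
    field_name: str
    value: Optional[str]
    source: SourceType


def count_by_source(fields: Iterable[IndexField]) -> Counter:
    """Tally how many fields come from each source type."""
    return Counter(f.source for f in fields)


# Made-up rows for illustration only; not taken from the Index.
rows = [
    IndexField("Agent A", "Red teaming", "Internal only", SourceType.PUBLIC_DOCUMENT),
    IndexField("Agent A", "Third-party testing", "Yes", SourceType.DEVELOPER_REPLY),
    IndexField("Agent B", "Red teaming", None, SourceType.NONE_FOUND),
]

counts = count_by_source(rows)
for source in SourceType:
    print(f"{source.value}: {counts[source]}")
```

With a tag on every field, a reader could separate what is publicly documented from what rests on a developer's reply alone.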

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive evaluation of the 2025 AI Agent Index as a timely baseline resource and for the recommendation of minor revision. We address the single major comment below and will incorporate the requested details to strengthen the manuscript.

Point-by-point responses
  1. Referee: Data Collection section: the manuscript provides insufficient detail on the number of developers contacted, response rates to correspondence, verification procedures for public claims, and the handling of incomplete or unverified entries. These omissions directly affect the reliability of the central observational claims about transparency levels and safety-feature coverage.

    Authors: We agree that the current description of data collection is insufficient for readers to fully assess the reliability of our transparency and safety-feature observations. In the revised manuscript we will expand the Data Collection section (or add a dedicated subsection) to report: the total number of developers contacted (both for the 30 indexed agents and any additional outreach), the response rate to email correspondence, the verification procedures applied to public claims (cross-referencing official documentation, multiple independent sources, and release notes), and the explicit protocol for incomplete or unverified entries (marking fields as “not publicly disclosed” and excluding them from aggregate statistics only when appropriate). These additions will be presented without changing the underlying data or conclusions.

    Revision: yes
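
The reporting promised in the rebuttal can be sketched in a few lines. The example assumes a response rate defined as responders divided by developers contacted, and shows how one aggregate changes depending on whether “not publicly disclosed” entries stay in the denominator. The function names and all numbers are hypothetical and do not come from the paper.

```python
from typing import Sequence

UNDISCLOSED = "not publicly disclosed"


def response_rate(contacted: int, responded: int) -> float:
    """Share of contacted developers who replied."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    if not 0 <= responded <= contacted:
        raise ValueError("responded must be between 0 and contacted")
    return responded / contacted


def share_with_value(
    values: Sequence[str], target: str, exclude_undisclosed: bool = False
) -> float:
    """Share of entries equal to target, optionally ignoring undisclosed ones."""
    pool = [v for v in values if v != UNDISCLOSED] if exclude_undisclosed else list(values)
    if not pool:
        raise ValueError("no entries left to aggregate")
    return sum(v == target for v in pool) / len(pool)


# Hypothetical annotations for one field across five agents.
values = ["yes", "no", UNDISCLOSED, UNDISCLOSED, "yes"]

share_all = share_with_value(values, "yes")
share_disclosed = share_with_value(values, "yes", exclude_undisclosed=True)
rate = response_rate(contacted=10, responded=4)

print(f"share 'yes' over all entries:       {share_all:.2f}")
print(f"share 'yes' over disclosed entries: {share_disclosed:.2f}")
print(f"response rate:                      {rate:.0%}")
```

The two shares differ whenever entries are undisclosed, so a revised manuscript would need to state which denominator each aggregate uses.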

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper is a purely observational documentation project that compiles publicly available information and voluntary developer correspondence into an index of 30 AI agents. It contains no equations, derivations, fitted parameters, predictions, or mathematical claims of any kind. All content consists of descriptive summaries of external data sources, with no load-bearing steps that reduce to self-definition, self-citation chains, or renaming of results. The central observations about transparency levels are direct reports from the collected sources and do not rely on any internal circular logic.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a documentation and survey paper with no mathematical derivations, fitted parameters, or new theoretical entities.

pith-pipeline@v0.9.0 · 5486 in / 1000 out tokens · 39221 ms · 2026-05-15T20:39:15.770772+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Moltbook Observatory Archive: an incremental dataset of agent-only social network activity

    cs.SI 2026-04 unverdicted novelty 7.0

    The Moltbook Observatory Archive is the first large-scale dataset from a social network populated exclusively by autonomous AI agents, covering 78 days with 2.6 million posts and 1.2 million comments.

  2. Anumati: Proof of Adherence as a Formal Consent Model for Autonomous Agent Protocols

    cs.CR 2026-04 unverdicted novelty 7.0

    Anumati defines proof of adherence via versioned PolicyDocument, ConsentRecord, and AdherenceEvent primitives as a non-breaking extension to A2A and MCP protocols.

  3. AI Agents Under EU Law

    cs.CY 2026-04 unverdicted novelty 7.0

    AI agent providers face an exhaustive inventory requirement for actions and data flows, as high-risk systems with untraceable behavioral drift cannot meet the AI Act's essential requirements.

  4. HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems

    cs.CR 2026-04 unverdicted novelty 6.0

    HDP is a lightweight protocol that binds human authorization to sessions via signed append-only token chains, enabling offline verification of delegation provenance using only an Ed25519 public key and session identifier.

  5. Security Considerations for Multi-agent Systems

    cs.CR 2026-03 unverdicted novelty 6.0

    No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.

  6. Nautilus: From One Prompt to Plug-and-Play Robot Learning

    cs.RO 2026-05 unverdicted novelty 5.0

    NAUTILUS is a prompt-driven harness that automates plug-and-play adapters, typed contracts, and validation for policies, benchmarks, and robots in learning research.

  7. Agentic AI in the Software Development Lifecycle: Architecture, Empirical Evidence, and the Reshaping of Software Engineering

    cs.SE 2026-04 unverdicted novelty 5.0

    Agentic AI systems are shifting software engineering from line-level code generation to delegated repository-scale execution under supervision, with SWE-bench performance rising from 1.96% to 78.4% and productivity ga...

  8. Sovereign Agentic Loops: Decoupling AI Reasoning from Execution in Real-World Systems

    cs.CR 2026-04 unverdicted novelty 5.0

    Sovereign Agentic Loops decouple LLM reasoning from execution by emitting validated intents through a control plane with obfuscation and evidence chains, blocking 93% of unsafe actions in a cloud prototype while addin...

  9. Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

    cs.SE 2026-04 unverdicted novelty 5.0

    Claude Code centers on a model-tool while-loop surrounded by permission systems, context compaction, extensibility hooks, subagent delegation, and session storage; the same design questions yield different answers in ...

  10. When the Agent Is the Adversary: Architectural Requirements for Agentic AI Containment After the April 2026 Frontier Model Escape

    cs.CR 2026-04 unverdicted novelty 3.0

    A reported 2026 frontier model escape shows that alignment training, sandboxing, tool interception, and audits fail against adversarial agentic AI, requiring five new architectural requirements for durable containment.

Reference graph

Works this paper leans on

156 extracted references · 156 canonical work pages · cited by 10 Pith papers · 5 internal anchors

  1. [1]

    Coalition for Content Provenance and Authenticity

    2025. Coalition for Content Provenance and Authenticity

  2. [2]

    Frontier AI Safety Commitments, AI Seoul Summit 2024

    2025. Frontier AI Safety Commitments, AI Seoul Summit 2024. https://www.gov.uk/government/publications/frontier-ai-safety- commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024

  3. [3]

    From Clawdbot to Moltbot to OpenClaw: Meet the AI agent generating buzz and fear globally.CNBC(2 Feb

    2026. From Clawdbot to Moltbot to OpenClaw: Meet the AI agent generating buzz and fear globally.CNBC(2 Feb. 2026). https: //www.cnbc.com/2026/02/02/openclaw-open-source-ai-agent-rise-controversy-clawdbot-moltbot-moltbook.html

  4. [4]

    AIAgentsList.com. 2025. AI Agents Directory 2025: 600+ AI Tools & Autonomous Agents. https://aiagentslist.com/

  5. [5]

    Amazon.com Services LLC. 2025. Amazon.com Services LLC v. Perplexity AI, Inc. Complaint filed in the U.S. District Court for the Western District of Washington. Case No. 3:25-cv-09514

  6. [6]

    Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, Zico Kolter, Matt Fredrikson, Eric Winsor, Jerome Wynne, Yarin Gal, and Xander Davies. 2025. AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents. InInternational Conference on Learning Representations

  7. [7]

    Samar Ansari. 2025. AI Slop and Data Pollution in the Age of Generative AI: Strategic Risks, Economic Consequences, and Governance Pathways for Business, Management, and the Creative Industries.Economic Consequences, and Governance Pathways for Business, Management, and the Creative Industries (October 23, 2025)(2025)

  8. [8]

    2025.Responsible Scaling Policy

    Anthropic. 2025.Responsible Scaling Policy. Technical Report

  9. [9]

    Matthew Arnold, Rachel KE Bellamy, Michael Hind, Stephanie Houde, Sameep Mehta, Aleksandra Mojsilović, Ravi Nair, K Natesan Ramamurthy, Alexandra Olteanu, David Piorkowski, et al . 2019. FactSheets: Increasing Trust in AI Services Through Supplier’s Declarations of Conformity.IBM Journal of Research and Development63, 4/5 (2019), 6–1

  10. [10]

    Ross Ashby

    W. Ross Ashby. 1956.An Introduction to Cybernetics. Chapman & Hall, London

  11. [11]

    Nicholas Barrow. 2024. Anthropomorphism and AI Hype.AI and Ethics4, 3 (Aug. 2024), 707–711. doi:10.1007/s43681-024-00454-1

  12. [12]

    Matthias Bastian. 2026. Malicious skills turn AI agent OpenClaw into a malware delivery system.The Decoder(8 Feb. 2026). https://the-decoder.com/malicious-skills-turn-ai-agent-openclaw-into-a-malware-delivery-system/

  13. [13]

    Julia Bazinska, Max Mathys, Francesco Casucci, Mateo Rojas-Carulla, Xander Davies, Alexandra Souly, and Niklas Pfister. 2025. Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents.arXiv preprint arXiv:2510.22620(2025)

  14. [14]

    Ivan Belcic and Cole Stryker. 2025. AI Agents in 2025: Expectations vs. Reality. IBM Think. https://www.ibm.com/think/insights/ai- agents-2025-expectations-vs-reality

  15. [15]

    Yoshua Bengio. 2023. AI and Catastrophic Risk.Journal of Democracy34, 4 (2023), 111–121

  16. [16]

    Yoshua Bengio, Sören Mindermann, Daniel Privitera, Tamay Besiroglu, Rishi Bommasani, Stephen Casper, Yejin Choi, Philip Fox, Ben Garfinkel, Danielle Goldfarb, Hoda Heidari, Anson Ho, Sayash Kapoor, Leila Khalatbari, Shayne Longpre, Sam Manning, Vasilios Mavroudis, Mantas Mazeika, Julian Michael, Jessica Newman, Kwan Yee Ng, Chinasa T. Okolo, Deborah Raji,...

  17. [17]

    Martin Beraja and Noam Yuchtman. 2025. Generalized Disruption: Society, Work, and Property Rights in the Age of AI. InNBER Chapters. National Bureau of Economic Research, Inc

  18. [18]

    Elettra Bietti. 2020. From Ethics Washing to Ethics Bashing: A View on Tech Ethics from Within Moral Philosophy. InProceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 210–219. doi:10.1145/3351095.3372860

  19. [19]

    Rishi Bommasani, Kevin Klyman, Sayash Kapoor, Shayne Longpre, Betty Xiong, Nestor Maslej, and Percy Liang. 2024. The 2024 Foundation Model Transparency Index.arXiv preprint arXiv:2407.12929(2024)

  20. [20]

    Rishi Bommasani, Kevin Klyman, Shayne Longpre, Sayash Kapoor, Nestor Maslej, Betty Xiong, Daniel Zhang, and Percy Liang. 2023. The Foundation Model Transparency Index.arXiv preprint arXiv:2310.12941(2023)

  21. [21]

    Rishi Bommasani, Dilara Soylu, Thomas I Liao, Kathleen A Creel, and Percy Liang. 2024. Ecosystem Graphs: The Social Footprint of Foundation Models. InProceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society

  22. [22]

    Stephen Casper, Luke Bailey, Rosco Hunter, Carson Ezell, Emma Cabalé, Michael Gerovitch, Stewart Slocum, Kevin Wei, Nikola Jurkovic, Ariba Khan, et al. 2025. The AI Agent Index.arXiv preprint arXiv:2502.01635(2025)

  23. [23]

    Artem Chaikin and Shivan Kaul Sahib. 2025. Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet. https: //brave.com/blog/comet-prompt-injection/

  24. [24]

    Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, et al. 2024. Visibility into AI Agents. InThe 2024 ACM Conference on Fairness, Accountability, and Transparency. 958–973

  25. [25]

    Alan Chan, Noam Kolt, Peter Wills, Usman Anwar, Christian Schroeder de Witt, Nitarshan Rajkumar, Lewis Hammond, David Krueger, Lennart Heim, and Markus Anderljung. 2024. IDs for AI Systems.arXiv preprint arXiv:2406.12137(2024)

  26. [26]

    Alan Chan, Rebecca Salganik, Alva Markelius, Chris Pang, Nitarshan Rajkumar, Dmitrii Krasheninnikov, Lauro Langosco, Zhonghao He, Yawen Duan, Micah Carroll, et al. 2023. Harms from Increasingly Agentic Algorithmic Systems. InProceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. ACM, Chicago, IL, USA, 651–666. doi:10.1145/35...

  27. [27]

    Alan Chan, Kevin Wei, Sihao Huang, Nitarshan Rajkumar, Elija Perrier, Seth Lazar, Gillian K Hadfield, and Markus Anderljung. 2025. Infrastructure for AI Agents.Transactions on Machine Learning Research(2025). https://openreview.net/forum?id=Ckh17xN2R2

  28. [28]

    Artificial Intelligence Safety Commitments

    China Academy of Information and Communications Technology. 2024. First 17 Companies Sign Landmark “Artificial Intelligence Safety Commitments” Setting a New Standard for Industry Self-Regulation. https://mp.weixin.qq.com/s/s-XFKQCWhu0uye4opgb3Ng

  29. [29]

    Joshua Clymer, Nick Gabrieli, David Krueger, and Thomas Larsen. 2024. Safety Cases: How to Justify the Safety of Advanced AI Systems.arXiv preprint arXiv:2403.10462(2024)

  30. [30]

    Michael K Cohen, Noam Kolt, Yoshua Bengio, Gillian K Hadfield, and Stuart Russell. 2024. Regulating Advanced Artificial Agents. Science384, 6691 (2024), 36–38

  31. [31]

    Cesareo Contreras. 2026. Why the OpenClaw AI agent is a ‘privacy nightmare’.Northeastern Global News(10 Feb. 2026). https: //news.northeastern.edu/2026/02/10/open-claw-ai-assistant/

  32. [33]

    doi:10.1145/3531146.3533157

    A. Feder Cooper, Emanuel Moss, Benjamin Laufer, and Helen Nissenbaum. 2022. Accountability in an Algorithmic Society: Relationality, Responsibility, and Robustness in Machine Learning. InProceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). Association for Computing Machinery, New York, NY, USA, 864–876. doi:10....

  33. [34]

    Gabriel Corral, Vaibhav Singhal, Brian Mitchell, and Reid Tatoris. 2025. Perplexity Is Using Stealth, Undeclared Crawlers to Evade Website No-Crawl Directives. https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no- crawl-directives/

  34. [35]

    Sumanth Dathathri, Abigail See, Sumedh Ghaisas, Po-Sen Huang, Rob McAdam, Johannes Welbl, Vandana Bachani, Alex Kaskasoli, Robert Stanforth, Tatiana Matejovicova, Jamie Hayes, Nidhi Vyas, Majd Al Merey, Jonah Brown-Cohen, Rudy Bunel, Borja Balle, Taylan Cemgil, Zahra Ahmed, Kitty Stacpoole, Ilia Shumailov, Ciprian Baetu, Sven Gowal, Demis Hassabis, and Pu...

  35. [36]

    2025.Technological Disruption in the Labor Market

    David J Deming, Christopher Ong, and Lawrence H Summers. 2025.Technological Disruption in the Labor Market. Technical Report. National Bureau of Economic Research. The 2025 AI Agent Index FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

  36. [37]

    1989.The Intentional Stance

    Daniel C Dennett. 1989.The Intentional Stance. MIT Press

  37. [38]

    Ruchira Dhar, Danae Sanchez Villegas, Antonia Karamolegkou, Alice Schiavone, Yifei Yuan, Xinyi Chen, Jiaang Li, Stella Frank, Laura De Grazia, Monorama Swain, et al . 2025. EvalCards: A Framework for Standardized Evaluation Reporting.arXiv preprint arXiv:2511.21695(2025)

  38. [39]

    and NYP Holdings, Inc

    Dow Jones & Co., Inc. and NYP Holdings, Inc. 2024. Dow Jones & Co., Inc. v. Perplexity AI, Inc. Complaint filed in the U.S. District Court for the Southern District of New York. Case No. 1:24-cv-07984

  39. [40]

    Leonard Dung. 2024. Understanding Artificial Agency.The Philosophical Quarterly(2024), pqae010

  40. [41]

    Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, and Daniel Kang. 2024. LLM Agents Can Autonomously Hack Websites.arXiv preprint arXiv:2402.06664(2024)

  41. [42]

    K. J. Kevin Feng, David W. McDonald, and Amy X. Zhang. 2025. Levels of Autonomy for AI Agents. doi:10.48550/arXiv.2506.12469

  42. [43]

    Luciano Floridi. 2019. Translating Principles into Practices of Digital Ethics: Five Risks of Being Unethical.Philosophy & Technology32, 2 (June 2019), 185–193. doi:10.1007/s13347-019-00354-x

  43. [44]

    Stan Franklin and Art Graesser. 1996. Is It an Agent, or Just a Program?: A Taxonomy for Autonomous Agents. InInternational Workshop on Agent Theories, Architectures, and Languages. Springer, 21–35

  44. [45]

    Frontier Model Forum. [n. d.]. Membership. https://www.frontiermodelforum.org/membership/

  45. [46]

    2025.AI Safety Index: Summer 2025

    Future of Life Institute. 2025.AI Safety Index: Summer 2025. Technical Report. Future of Life Institute. https://futureoflife.org/wp- content/uploads/2025/07/FLI-AI-Safety-Index-Report-Summer-2025.pdf

  46. [47]

    Iason Gabriel, Arianna Manzini, Geoff Keeling, Lisa Anne Hendricks, Verena Rieser, Hasan Iqbal, Nenad Tomašev, Ira Ktena, Zachary Kenton, Mikel Rodriguez, et al. 2024. The Ethics of Advanced AI Assistants.arXiv preprint arXiv:2404.16244(2024)

  47. [48]

    Salvatore Gariuolo, Vincenzo Ciancaglini, and Fernando Tucci. 2026. Viral AI, Invisible Risks: What OpenClaw Reveals About Agentic Assistants.Trend Micro Research(6 Feb. 2026). https://www.trendmicro.com/en_us/research/26/b/what-openclaw-reveals-about- agentic-assistants.html

  48. [49]

    Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford

  49. [50]

    InProceedings of the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning

    Datasheets for Datasets. InProceedings of the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning

  50. [51]

    Elizabeth Gibney. 2025. AI Bots Wrote and Reviewed All Papers at This Conference.Nature646, 8086 (2025), 786–786

  51. [52]

    Elizabeth Gibney. 2025. How AI Agents Will Change Research: A Scientist’s Guide.Nature(3 Oct. 2025). doi:10.1038/d41586-025-03246-7

  52. [53]

    Thomas Krendl Gilbert, Nathan Lambert, Sarah Dean, Tom Zick, and Aaron Snoswell. 2023. Reward Reports for Reinforcement Learning. InProceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society

  53. [54]

    Aviad Gispan. 2025. CometJacking: How One Click Can Turn Perplexity’s Comet AI Browser Against You. https://layerxsecurity.com/ blog/cometjacking-how-one-click-can-turn-perplexitys-comet-ai-browser-against-you/

  54. [55]

    Google Cloud. 2025. What Are AI Agents? Definition, Examples, and Types. https://cloud.google.com/discover/what-are-ai-agents

  55. [56]

    Juraj Gottweis, Wei-Hung Weng, Alexander Daryin, Tao Tu, Anil Palepu, Petar Sirkovic, Artiom Myaskovsky, Felix Weissenberger, Keran Rong, Ryutaro Tanno, et al. 2025. Towards an AI Co-Scientist.arXiv preprint arXiv:2502.18864(2025)

  56. [57]

    Mourad Gridach, Jay Nanavati, Khaldoun Zine El Abidine, Lenon Mendes, and Christina Mack. 2025. Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions.arXiv preprint arXiv:2503.08979(2025)

  57. [58]

    Furkan Gursoy and Ioannis A Kakadiaris. 2022. System Cards for AI-Based Decision-Making for Public Policy.arXiv preprint arXiv:2203.04754(2022)

  58. [59]

    Gillian K Hadfield and Andrew Koh. 2025. An Economy of AI Agents. InThe Economics of Transformative AI. National Bureau of Economic Research. https://www.nber.org/system/files/chapters/c15305/c15305.pdf

  59. [60]

    Teresa Hammerschmidt, Katharina Stolz, and Oliver Posegga. 2025. Bridging the Gap: Inequalities That Divide Those Who Can and Cannot Create Sustainable Outcomes with AI.Behaviour & Information Technology(2025), 1–30

  60. [61]

    Dan Hendrycks, Mantas Mazeika, and Thomas Woodside. 2023. An Overview of Catastrophic AI Risks.arXiv preprint arXiv:2306.12001 (2023)

  61. [62]

    Johannes Himmelreich. 2019. Responsibility for Killer Robots.Ethical Theory and Moral Practice22, 3 (2019), 731–747

  62. [63]

    Nanna Inie, Stefania Druga, Peter Zukerman, and Emily M. Bender. 2024. From “AI” to Probabilistic Automation: How Does Anthropomorphization of Technical Systems Descriptions Influence Trust?. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24). Association for Computing Machinery, New York, NY, USA, 2322–2347...

  63. [64]

    Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksander Madry, Alex Beutel, Alex Carney, et al. 2024. OpenAI o1 System Card. arXiv preprint arXiv:2412.16720 (2024)

  64. [65]

    Nicholas R Jennings. 2000. On Agent-Based Software Engineering. Artificial Intelligence 117, 2 (2000), 277–296

  65. [66]

    Sayash Kapoor, Noam Kolt, and Seth Lazar. 2025. Position: Build Agent Advocates, Not Platform Agents. https://openreview.net/forum?id=jd1N60VNFE

  66. [67]

    Sayash Kapoor, Benedikt Stroebl, Peter Kirgis, Nitya Nadgir, Zachary S Siegel, Boyi Wei, Tianci Xue, Ziru Chen, Felix Chen, Saiteja Utpala, et al. 2025. Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation. arXiv preprint arXiv:2510.11977 (2025)

  67. [68]

    Atoosa Kasirzadeh and Iason Gabriel. 2025. Characterizing AI Agents for Alignment and Governance. arXiv preprint arXiv:2504.21848 (April 2025). doi:10.48550/arXiv.2504.21848

  68. [69]

    Joshua Kazdan, Rylan Schaeffer, Apratim Dey, Matthias Gerstgrasser, Rafael Rafailov, David L Donoho, and Sanmi Koyejo. 2024. Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World. arXiv preprint arXiv:2410.16713 (2024)

  69. [70]

    Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, and Tom Everitt. 2023. Discovering Agents. Artificial Intelligence 322 (2023), 103963

  70. [71]

    Noam Kolt. 2025. Governing AI Agents. Notre Dame Law Review (2025)

  71. [72]

    Noam Kolt, Nicholas Caputo, Jack Boeglin, Cullen O’Keefe, Rishi Bommasani, Stephen Casper, Mariano-Florentino Cuéllar, Noah Feldman, Iason Gabriel, Gillian K Hadfield, et al. 2026. Legal Alignment for Safe and Ethical AI. arXiv preprint arXiv:2601.04175 (2026)

  72. [73]

    Tomek Korbak, Mikita Balesni, Buck Shlegeris, and Geoffrey Irving. 2025. How to Evaluate Control Measures for LLM Agents? A Trajectory from Today to Superintelligence. arXiv preprint arXiv:2504.05259 (2025)

  73. [74]

    Dan M Kotliar. 2025. Can’t Stop the Hype: Scrutinizing AI’s Realities. Information, Communication & Society (2025), 1–22

  74. [75]

    Michael Kouremetis, Marissa Dotter, Alex Byrne, Dan Martin, Ethan Michalak, Gianpaolo Russo, Michael Threet, and Guido Zarrella. 2025. OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities. arXiv preprint arXiv:2502.15797 (2025)

  76. [77]

    Priyanshu Kumar, Elaine Lau, Saranya Vijayakumar, Tu Trinh, Scale Red Team, Elaine Chang, Vaughn Robinson, Sean Hendryx, Shuyan Zhou, Matt Fredrikson, et al. 2024. Refusal-Trained LLMs Are Easily Jailbroken as Browser Agents. arXiv preprint arXiv:2410.13886 (2024)

  77. [78]

    Weixin Liang, Nazneen Rajani, Xinyu Yang, Ezinwanne Ozoani, Eric Wu, Yiqun Chen, Daniel Scott Smith, and James Zou. 2024. Systematic Analysis of 32,111 AI Model Cards Characterizes Documentation Practice in AI. Nature Machine Intelligence 6, 7 (July 2024), 744–753. doi:10.1038/s42256-024-00857-z

  78. [80]

    Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi (Alexis) Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, and Sara Hooker. 2024. A Large-Scale Audit of Dataset Licensing and Attribution in AI. Nature Machine ...

  79. [81]

    Pattie Maes. 1990. Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back. MIT Press

  80. [82]

    Pattie Maes. 1993. Modeling Adaptive Autonomous Agents. Artificial Life 1, 1–2 (1993), 135–162
