Pith · machine review for the scientific record

arxiv: 2605.01611 · v1 · submitted 2026-05-02 · 💻 cs.CY · cs.AI · cs.LG

Recognition: unknown

The Case for ESM3 as a General-Purpose AI Model with Systemic Risk Under the EU AI Act

Jacob Griffith, Koen Holtman, Marcel Mir Teijeiro, Rokas Gipiškis, Taro Qureshi, Ze Shen Chin

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 13:42 UTC · model grok-4.3

classification 💻 cs.CY · cs.AI · cs.LG
keywords esm3 · models · biological · conclude · general-purpose · obligations · risk · subject

The pith

ESM3 maps onto the biorisk chain, yet does not currently qualify as a general-purpose AI model with systemic risk under the EU AI Act; the paper proposes regulatory remedies to close this gap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The EU AI Act sets rules for powerful AI systems that could cause widespread harm. ESM3 is a large model trained on biological data to understand proteins and design molecules. The authors trace how ESM3 could be part of a chain leading to biological risks, such as helping create harmful agents. They compare ESM3's features like its scale and capabilities against the Act's specific thresholds for 'systemic risk' models. These thresholds focus on things like training compute or certain high-risk uses. The analysis finds that ESM3 falls short of triggering the obligations. The paper then suggests updates to the law so that providers of similar biological models must evaluate and reduce dual-use risks before release.
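The compute-based route the summary mentions can be made concrete with a small sketch. Under the Act, a general-purpose model is presumed to have systemic risk when its cumulative training compute exceeds 10^25 FLOPs; the ESM3 figure below is an illustrative order-of-magnitude placeholder, not an official number, and the function name is ours, not the paper's.

```python
# Sketch of the EU AI Act's compute-based presumption for systemic risk.
# The 1e25 FLOP threshold is the presumption criterion in the Act;
# the ESM3 estimate is a hypothetical placeholder for illustration only.

AI_ACT_FLOP_THRESHOLD = 1e25  # cumulative training compute threshold


def presumed_systemic_risk(training_flops: float) -> bool:
    """Return True if training compute alone triggers the presumption."""
    return training_flops >= AI_ACT_FLOP_THRESHOLD


# A placeholder estimate roughly an order of magnitude below the threshold,
# matching the paper's finding that ESM3 falls short of triggering it.
esm3_flops_estimate = 1.1e24

print(presumed_systemic_risk(esm3_flops_estimate))  # False
```

On this criterion alone, a model below the threshold escapes the presumption regardless of its capabilities, which is the gap the paper's qualitative biorisk-chain argument targets.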

Core claim

We conclude that at this time, ESM3 does not appear to be meaningfully regulated by the Act. We then propose remedies to correct the situation.

Load-bearing premise

The assumption that the EU AI Act's current classification criteria and supporting material provide a complete and accurate basis for determining whether biological models like ESM3 pose systemic risks that would trigger the Act's obligations.

Original abstract

Due to ambiguity in the wording of the EU AI Act, we examine the question of to what extent frontier biological foundation models such as ESM3 are subject to obligations for general-purpose AI models with systemic risk under the EU AI Act. In this paper, we map ESM3 to the biorisk chain, and conclude that it would be desirable if the providers of ESM3 and similar biological models were subject to these obligations, which would require them to assess and mitigate dual-use risks from their models. We then perform an analysis, comparing the attributes of ESM3 to the classification criteria in the AI Act and the supporting material. We conclude that at this time, ESM3 does not appear to be meaningfully regulated by the Act. We then propose remedies to correct the situation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines the extent to which the biological foundation model ESM3 is subject to obligations for general-purpose AI models with systemic risk under the EU AI Act. It maps ESM3 capabilities onto the biorisk chain, argues that subjecting such models to these obligations would be desirable for dual-use risk assessment and mitigation, compares ESM3 attributes against the Act's classification criteria and supporting material (including quantitative thresholds), concludes that ESM3 is not meaningfully regulated at present, and proposes remedies to address the identified gap.

Significance. If the legal mapping and attribute comparison hold, the paper identifies a concrete regulatory gap for advanced biological AI models under the EU AI Act, with potential implications for biosecurity governance. The biorisk-chain mapping and direct comparison to Act criteria provide a structured, falsifiable framework for evaluating similar models that could inform policy amendments or enforcement guidance. The work is strengthened by its explicit separation of the desirability argument from the regulatory-status conclusion.

major comments (2)
  1. [attribute comparison and classification criteria analysis] Analysis of classification criteria (post-biorisk mapping section): The conclusion that ESM3 fails to meet systemic-risk criteria rests on an interpretation that the Act (Art. 51, Annex XIII, and GPAI Code of Practice) relies exclusively on quantitative thresholds such as training compute or parameter count, with no binding qualitative route for dual-use biological design capabilities. However, the paper's own biorisk-chain mapping demonstrates high dual-use potential; the analysis does not cite or refute specific language in the supporting material that might permit capability-based triggers, leaving the 'not meaningfully regulated' claim load-bearing but incompletely tested against alternative readings of the Act.
  2. [remedies proposal] Remedies section: The proposed remedies to bring biological foundation models under the Act's obligations are outlined at a high level without specifying concrete mechanisms (e.g., amendments to Annex XIII or updates to the Code of Practice) or assessing their compatibility with the Act's existing quantitative framework. This weakens the actionability of the central policy recommendation.
minor comments (2)
  1. [abstract and introduction] The abstract and introduction could more explicitly distinguish the desirability argument from the regulatory-status finding to avoid any appearance of conflating policy preference with legal analysis.
  2. [throughout classification criteria section] Several citations to EU AI Act articles and annexes are referenced by number but lack pinpoint page or paragraph references in the supporting material, which would aid readers in verifying the attribute comparisons.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis depends on interpretive assumptions about the EU AI Act's scope and criteria rather than mathematical derivations or empirical data fitting.

axioms (1)
  • domain assumption The EU AI Act's classification criteria for general-purpose AI models with systemic risk, as described in the Act and supporting material, are the appropriate and sufficient standard for assessing regulatory obligations for biological foundation models.
    Invoked when mapping ESM3 to the criteria and concluding it does not trigger obligations.

pith-pipeline@v0.9.0 · 5456 in / 1244 out tokens · 84444 ms · 2026-05-09T13:42:16.274082+00:00 · methodology

discussion (0)
