Position: No Retroactive Cure for Infringement during Training
Pith reviewed 2026-05-10 05:08 UTC · model grok-4.3
The pith
Post-hoc mitigation cannot retroactively cure liability for unlawful data acquisition and training in generative AI.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Unauthorized copying or ingestion during training constitutes a legally complete act, and model weights function as fixed copies that retain training-derived expressive value; therefore post-hoc methods such as unlearning cannot retroactively establish compliance, contract and tort rules can restrict use independently of copyright defenses, and remedies including unjust enrichment may require stripping gains or reaching the model.
What carries the argument
Model weights as fixed copies that retain training-derived expressive value, making the timing of compliance checks hinge on data lineage rather than later outputs.
If this is right
- Liability for infringement attaches at the moment of unauthorized ingestion and survives any subsequent filtering or unlearning.
- Contractual licenses and unfair-competition principles can limit use even when copyright defenses such as fair use would otherwise apply.
- Remedies for unjust enrichment may require disgorgement of gains traceable to protected inputs and, in some cases, restrictions on the model itself.
- Development practices must prioritize verifiable lawful data sourcing and process documentation over reliance on post-training fixes.
Where Pith is reading between the lines
- This position would encourage courts and regulators to treat the training phase as a distinct point of legal risk rather than focusing only on deployed outputs.
- It could accelerate requirements for auditable data provenance records in large-scale model development.
- The argument implies that mitigation efforts might still be useful for reducing ongoing harm but cannot serve as a complete defense to past acquisition violations.
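The kind of post-hoc mitigation at issue can be made concrete. The following is a minimal numpy sketch, not the paper's method: a toy regression model is trained on a dataset that includes a contested slice, and "approximate unlearning" is then attempted via gradient ascent on that slice. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for "training on a dataset that includes a contested slice"
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=100)

forget_X, forget_y = X[:20], y[:20]  # the slice a rights-holder contests

def mse(w, X, y):
    return 0.5 * np.mean((X @ w - y) ** 2)

def grad(w, X, y):
    # Gradient of the mean-squared-error loss above
    return X.T @ (X @ w - y) / len(y)

# Train on everything, contested slice included
w = np.zeros(5)
for _ in range(500):
    w -= 0.1 * grad(w, X, y)

loss_before = mse(w, forget_X, forget_y)

# Approximate unlearning: gradient *ascent* on the forget slice,
# attempting to undo its learning signal after the fact
w_u = w.copy()
for _ in range(50):
    w_u += 0.05 * grad(w_u, forget_X, forget_y)

loss_after = mse(w_u, forget_X, forget_y)
```

The model's fit to the forget slice degrades (`loss_after > loss_before`), which is the technical sense in which unlearning "works"; the paper's point is that the ingestion and training already occurred, so the edit arrives after the legally relevant act.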
Load-bearing premise
Model weights operate as fixed copies that retain training-derived expressive value, rendering later filtering beside the point for infringement.
What would settle it
A demonstration, whether technical or judicial, that complete removal of influence from specific unauthorized training data eliminates all legal liability attached to the original ingestion and training steps.
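The relevant technical benchmark is the counterfactual-clean model: one retrained from scratch without the contested data. A minimal least-squares sketch, with all names and the setup assumed for illustration, shows what an "exact removal of influence" claim would have to match.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: a model trained WITH a contested slice vs. one
# retrained from scratch WITHOUT it (the counterfactual-clean model)
X = rng.normal(size=(60, 3))
w_star = np.array([1.0, -2.0, 0.5])
y = X @ w_star + 0.1 * rng.normal(size=60)

def fit(X, y):
    # Closed-form least squares stands in for "training"
    return np.linalg.lstsq(X, y, rcond=None)[0]

w_full = fit(X, y)             # trained on everything
w_clean = fit(X[10:], y[10:])  # never saw the contested first 10 rows

# An unlearning method claiming complete removal of influence would
# have to map w_full onto w_clean; in general the two differ.
gap = np.linalg.norm(w_full - w_clean)
```

Even in this convex toy case the two weight vectors differ (`gap > 0`); and, per the paper's argument, even a method that achieved `gap == 0` would address only the model's current state, not the liability attached to the original ingestion.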
Original abstract
As generative AI faces intensifying legal challenges, the machine learning community has increasingly relied on post-hoc mitigation -- especially machine unlearning and inference-time guardrails -- to argue for compliance. This paper argues that such post-hoc mitigation methods cannot retroactively cure liability from unlawful acquisition and training, because compliance hinges on data lineage, not the outputs. Our argument has three parts. First, unauthorized copying/ingestion can be a legally completed act, and model weights may operate as fixed copies that retain training-derived expressive value, making later filtering beside the point for infringement. Second, contract and tort/unfair-competition rules -- via licenses, terms of service, and anti-free-riding principles -- can independently restrict access and use, often bypassing copyright defenses (e.g., fair use or TDM exceptions). Third, since value from protected inputs can persist in weights, remedies such as unjust enrichment and disgorgement may require stripping gains and, in some cases, reaching the model itself. We therefore argue for a shift from Post-Hoc Sanitization to verifiable Ex-Ante Process Compliance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript argues that post-hoc mitigation methods such as machine unlearning and inference-time guardrails cannot retroactively cure legal liability arising from unlawful acquisition and training of data for generative AI models. Compliance hinges on data lineage rather than outputs. The three-part argument is: (1) unauthorized copying/ingestion is a completed act, with model weights operating as fixed copies that retain training-derived expressive value, rendering later filtering irrelevant to infringement; (2) contract and tort/unfair-competition rules (licenses, terms of service, anti-free-riding) can independently restrict use and often bypass copyright defenses such as fair use or TDM exceptions; (3) since value from protected inputs can persist in weights, remedies including unjust enrichment and disgorgement may require stripping gains and, in some cases, reaching the model itself. The paper advocates shifting from post-hoc sanitization to verifiable ex-ante process compliance.
Significance. If the interpretive premises hold, the position would have substantial implications for AI development by prioritizing lawful data sourcing and ex-ante compliance over technical fixes after training. It offers a clear normative framework linking technical processes to legal doctrines on copying, contracts, and remedies, which could inform policy and research priorities at the ML-law intersection. The paper's strength lies in its logical structure, which avoids internal circularity and unsupported factual premises by drawing on external legal principles. As a position paper without empirical data or formal derivations, its significance depends on acceptance of the doctrinal claims rather than technical novelty.
major comments (2)
- [First part of the argument] First part (unauthorized copying/ingestion): The claim that 'model weights may operate as fixed copies that retain training-derived expressive value' is load-bearing for the conclusion that post-hoc filtering is 'beside the point for infringement.' This interpretive premise about the nature of weights and persistence of expressive value requires additional grounding in technical literature on information encoding in neural networks or specific case precedents on derivative works and fixation in software, as contestability here directly affects the first pillar of the argument.
- [Third part of the argument] Third part (remedies): The discussion of unjust enrichment, disgorgement, and potential reach to the model itself is central to arguing that value persists beyond outputs. This would be strengthened by citing concrete examples or doctrines where similar remedies have been applied to trained models or intangible assets derived from protected inputs, to support the claim that post-hoc changes cannot cure the underlying liability.
minor comments (2)
- [Abstract] The abstract and introduction could clarify the jurisdictional scope (e.g., US copyright law focus) since doctrines like fair use and TDM exceptions vary significantly across jurisdictions, affecting the generality of the bypass claim.
- [Overall structure] Consider adding a brief table or structured summary comparing post-hoc methods (unlearning, guardrails) against the three legal pillars to improve readability for a mixed technical-legal audience.
Simulated Author's Rebuttal
We thank the referee for the constructive review and for recognizing the paper's logical structure and potential implications for AI development and policy. We address each major comment below and will incorporate revisions to strengthen the manuscript as suggested.
Point-by-point responses
-
Referee: [First part of the argument] First part (unauthorized copying/ingestion): The claim that 'model weights may operate as fixed copies that retain training-derived expressive value' is load-bearing for the conclusion that post-hoc filtering is 'beside the point for infringement.' This interpretive premise about the nature of weights and persistence of expressive value requires additional grounding in technical literature on information encoding in neural networks or specific case precedents on derivative works and fixation in software, as contestability here directly affects the first pillar of the argument.
Authors: We agree that additional grounding would make the first pillar more robust. The manuscript, as a position paper, rests on copyright law's fixation doctrine, under which a work is fixed when embodied in a tangible medium permitting perception, reproduction, or communication. Model weights qualify as such a medium because they persistently encode and retain expressive elements from training data, enabling reproduction of similar outputs. To address the comment, we will revise the relevant section to cite technical literature on information encoding and retention in neural networks, including studies on memorization, data extraction attacks, and membership inference that demonstrate how training data influences and persists in weights. We will also reference legal precedents on derivative works and fixation as applied to software and databases. These additions will clarify the premise without altering the core argument that unauthorized ingestion is a completed act. revision: yes
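The memorization and membership-inference literature the authors propose to cite can be illustrated with a minimal numpy sketch of the loss-threshold membership signal. The overparameterized linear model below is an assumed stand-in for a large network, not anything from the paper: with more parameters than samples it interpolates even random labels, so training members sit at near-zero loss while fresh non-members do not.

```python
import numpy as np

rng = np.random.default_rng(2)

# Overparameterized toy model (more features than samples) memorizes
# even pure-noise labels: a stand-in for memorization in large networks
n, p = 50, 100
X_mem = rng.normal(size=(n, p))
y_mem = rng.normal(size=n)  # labels carry no signal; fitting = memorizing
w = np.linalg.lstsq(X_mem, y_mem, rcond=None)[0]  # min-norm interpolator

# Fresh non-member data drawn from the same distribution
X_non = rng.normal(size=(n, p))
y_non = rng.normal(size=n)

mem_loss = np.mean((X_mem @ w - y_mem) ** 2)  # ~0: memorized exactly
non_loss = np.mean((X_non @ w - y_non) ** 2)  # far from 0

# A simple loss threshold between the two separates members from
# non-members: the training data leaves a detectable trace in w.
```

The gap between `mem_loss` and `non_loss` is the persistence-in-weights phenomenon the rebuttal invokes: the fitted parameters measurably retain information about what they were trained on.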
-
Referee: [Third part of the argument] Third part (remedies): The discussion of unjust enrichment, disgorgement, and potential reach to the model itself is central to arguing that value persists beyond outputs. This would be strengthened by citing concrete examples or doctrines where similar remedies have been applied to trained models or intangible assets derived from protected inputs, to support the claim that post-hoc changes cannot cure the underlying liability.
Authors: We concur that concrete examples and doctrinal support would strengthen the remedies analysis. The paper invokes general principles of unjust enrichment and disgorgement, which require stripping benefits derived from unauthorized use of protected inputs, including in intellectual property contexts. To respond, we will revise the third part to include references to analogous doctrines and cases involving intangible assets, such as disgorgement of profits in trade secret misappropriation where derived products or knowledge are at issue, and copyright cases where remedies reach works incorporating protected expression. While direct precedents on trained generative models remain limited and emerging, these citations will better illustrate how value retention in weights can trigger remedies that post-hoc mitigation cannot retroactively eliminate. The revision will maintain the position paper's normative focus on ex-ante compliance. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper is a normative legal position paper with no equations, fitted parameters, technical derivations, or empirical predictions. Its three-part argument rests on external legal doctrines (copyright completion, contract/tort restrictions, and disgorgement remedies) and interpretive claims about model weights as fixed copies, all drawn from cited external principles rather than internal self-definitions or self-citation chains. No load-bearing step reduces by construction to the paper's own inputs; the central claim is self-contained against external legal benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Unauthorized copying or ingestion of protected material constitutes a completed legal act even if later outputs are filtered.
- domain assumption Contract and tort rules can restrict use independently of copyright defenses such as fair use.
Reference graph
Works this paper leans on
- [1] Integrated Cash Management Services, Inc. v. Digital Transactions, Inc., 920 F.2d 171 (2d Cir. 1990).
- [2] Sega Enterprises Ltd. v. Accolade, Inc., 977 F.2d 1510 (9th Cir. 1992).
- [3] MAI Systems Corp. v. Peak Computer, Inc., 991 F.2d 511 (9th Cir. 1993).
- [4] ProCD, Inc. v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996).
- [5] Tsubasa System Co., Ltd. v. Toppan Printing Co., Ltd., 1780 Hanrei Jiho 25 (Tokyo Dist. Ct. 2001).
- [6] Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
- [7] Ryanair Ltd v PR Aviation BV, Case C-30/14 (CJEU 2015).
- [8] Capitol Records, LLC v. ReDigi Inc., 910 F.3d 649 (2d Cir. 2018).
- [9] CV-Online Latvia v Melons, Case C-762/19 (CJEU 2021).
- [10] Getty Images (US), Inc. v. Stability AI, Ltd., No. 1:23-cv-00135 (D. Del. filed Feb. 3, 2023).
- [11] The New York Times Co. v. Microsoft Corp. et al., No. 1:23-cv-11195 (S.D.N.Y. filed Dec. 27, 2023).
- [12] Thomson Reuters Enter. Ctr. GmbH v. Ross Intelligence Inc., No. 1:20-cv-00613 (D. Del. Sept. 25, 2023).
- [13] Bartz et al. v. Anthropic PBC, Order on Fair Use, No. C 24-05417 WHA (N.D. Cal. June 23, 2025).
- [14] Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., et al. Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073, 2022.
- [15] Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., et al. On the Opportunities and Risks of Foundation Models. arXiv preprint arXiv:2108.07258, 2021.
- [16] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., et al. Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165, 2020.
- [17] Eldan, R. and Russinovich, M. Who's Harry Potter? Approximate Unlearning in LLMs. arXiv preprint arXiv:2310.02238, 2023.
- [18] Inan, H., Upasani, K., Chi, J., Rungta, R., Iyer, K., Mao, Y., Tontchev, M., Hu, Q., Fuller, B., Testuggine, D., et al. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. arXiv preprint arXiv:2312.06674, 2023.