pith. machine review for the scientific record.

arxiv: 2604.18649 · v1 · submitted 2026-04-20 · 💻 cs.CR · cs.AI

Recognition: unknown

Position: No Retroactive Cure for Infringement during Training


Pith reviewed 2026-05-10 05:08 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords generative AI · machine unlearning · copyright infringement · data lineage · ex-ante compliance · post-hoc mitigation · unjust enrichment · unfair competition

The pith

Post-hoc mitigation cannot retroactively cure liability for unlawful data acquisition and training in generative AI.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that techniques such as machine unlearning and inference-time guardrails fail to erase legal exposure when training began with unauthorized data. Compliance is determined by the original data lineage and the act of ingestion, not by the state of the final model or its outputs. This matters because developers increasingly treat post-training fixes as sufficient for compliance, yet the authors contend that copying during training can be a completed legal act preserved in the weights. They outline three supporting lines: the completed nature of unauthorized ingestion, independent restrictions from contracts and unfair-competition rules, and the possibility that remedies reach the model itself through disgorgement. The conclusion is that the field must move from post-hoc sanitization to verifiable ex-ante process compliance.

Core claim

Unauthorized copying or ingestion during training constitutes a legally complete act, and model weights function as fixed copies that retain training-derived expressive value; therefore post-hoc methods such as unlearning cannot retroactively establish compliance, contract and tort rules can restrict use independently of copyright defenses, and remedies including unjust enrichment may require stripping gains or reaching the model.

What carries the argument

Model weights as fixed copies that retain training-derived expressive value, making the timing of compliance checks hinge on data lineage rather than later outputs.

If this is right

  • Liability for infringement attaches at the moment of unauthorized ingestion and survives any subsequent filtering or unlearning.
  • Contractual licenses and unfair-competition principles can limit use even when copyright defenses such as fair use would otherwise apply.
  • Remedies for unjust enrichment may require disgorgement of gains traceable to protected inputs and, in some cases, restrictions on the model itself.
  • Development practices must prioritize verifiable lawful data sourcing and process documentation over reliance on post-training fixes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This position would encourage courts and regulators to treat the training phase as a distinct point of legal risk rather than focusing only on deployed outputs.
  • It could accelerate requirements for auditable data provenance records in large-scale model development.
  • The argument implies that mitigation efforts might still be useful for reducing ongoing harm but cannot serve as a complete defense to past acquisition violations.

Load-bearing premise

Model weights operate as fixed copies that retain training-derived expressive value, rendering later filtering beside the point for infringement.

What would settle it

A demonstration, whether technical or judicial, that complete removal of influence from specific unauthorized training data eliminates all legal liability attached to the original ingestion and training steps.

read the original abstract

As generative AI faces intensifying legal challenges, the machine learning community has increasingly relied on post-hoc mitigation -- especially machine unlearning and inference-time guardrails -- to argue for compliance. This paper argues that such post-hoc mitigation methods cannot retroactively cure liability from unlawful acquisition and training, because compliance hinges on data lineage, not the outputs. Our argument has three parts. First, unauthorized copying/ingestion can be a legally completed act, and model weights may operate as fixed copies that retain training-derived expressive value, making later filtering beside the point for infringement. Second, contract and tort/unfair-competition rules -- via licenses, terms of service, and anti-free-riding principles -- can independently restrict access and use, often bypassing copyright defenses (e.g., fair use or TDM exceptions). Third, since value from protected inputs can persist in weights, remedies such as unjust enrichment and disgorgement may require stripping gains and, in some cases, reaching the model itself. We therefore argue for a shift from Post-Hoc Sanitization to verifiable Ex-Ante Process Compliance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript argues that post-hoc mitigation methods such as machine unlearning and inference-time guardrails cannot retroactively cure legal liability arising from unlawful acquisition and training of data for generative AI models. Compliance hinges on data lineage rather than outputs. The three-part argument is: (1) unauthorized copying/ingestion is a completed act, with model weights operating as fixed copies that retain training-derived expressive value, rendering later filtering irrelevant to infringement; (2) contract and tort/unfair-competition rules (licenses, terms of service, anti-free-riding) can independently restrict use and often bypass copyright defenses such as fair use or TDM exceptions; (3) since value from protected inputs can persist in weights, remedies including unjust enrichment and disgorgement may require stripping gains and, in some cases, reaching the model itself. The paper advocates shifting from post-hoc sanitization to verifiable ex-ante process compliance.

Significance. If the interpretive premises hold, the position would have substantial implications for AI development by prioritizing lawful data sourcing and ex-ante compliance over technical fixes after training. It offers a clear normative framework linking technical processes to legal doctrines on copying, contracts, and remedies, which could inform policy and research priorities at the ML-law intersection. The paper's strength is its logical structure without internal circularity or unsupported factual premises, drawing on external legal principles. As a position paper without empirical data or formal derivations, its significance depends on acceptance of the doctrinal claims rather than technical novelty.

major comments (2)
  1. [First part of the argument] First part (unauthorized copying/ingestion): The claim that 'model weights may operate as fixed copies that retain training-derived expressive value' is load-bearing for the conclusion that post-hoc filtering is 'beside the point for infringement.' This interpretive premise about the nature of weights and persistence of expressive value requires additional grounding in technical literature on information encoding in neural networks or specific case precedents on derivative works and fixation in software, as contestability here directly affects the first pillar of the argument.
  2. [Third part of the argument] Third part (remedies): The discussion of unjust enrichment, disgorgement, and potential reach to the model itself is central to arguing that value persists beyond outputs. This would be strengthened by citing concrete examples or doctrines where similar remedies have been applied to trained models or intangible assets derived from protected inputs, to support the claim that post-hoc changes cannot cure the underlying liability.
minor comments (2)
  1. [Abstract] The abstract and introduction could clarify the jurisdictional scope (e.g., US copyright law focus) since doctrines like fair use and TDM exceptions vary significantly across jurisdictions, affecting the generality of the bypass claim.
  2. [Overall structure] Consider adding a brief table or structured summary comparing post-hoc methods (unlearning, guardrails) against the three legal pillars to improve readability for a mixed technical-legal audience.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the paper's logical structure and potential implications for AI development and policy. We address each major comment below and will incorporate revisions to strengthen the manuscript as suggested.

read point-by-point responses
  1. Referee: [First part of the argument] First part (unauthorized copying/ingestion): The claim that 'model weights may operate as fixed copies that retain training-derived expressive value' is load-bearing for the conclusion that post-hoc filtering is 'beside the point for infringement.' This interpretive premise about the nature of weights and persistence of expressive value requires additional grounding in technical literature on information encoding in neural networks or specific case precedents on derivative works and fixation in software, as contestability here directly affects the first pillar of the argument.

    Authors: We agree that additional grounding would make the first pillar more robust. The manuscript, as a position paper, rests on copyright law's fixation doctrine, under which a work is fixed when embodied in a tangible medium permitting perception, reproduction, or communication. Model weights qualify as such a medium because they persistently encode and retain expressive elements from training data, enabling reproduction of similar outputs. To address the comment, we will revise the relevant section to cite technical literature on information encoding and retention in neural networks, including studies on memorization, data extraction attacks, and membership inference that demonstrate how training data influences and persists in weights. We will also reference legal precedents on derivative works and fixation as applied to software and databases. These additions will clarify the premise without altering the core argument that unauthorized ingestion is a completed act. revision: yes
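The rebuttal's appeal to memorization and membership-inference results can be illustrated with a deliberately tiny editorial sketch: a loss-threshold membership test in the style of the membership-inference literature. After fitting a one-parameter model, per-example loss cleanly separates training members from held-out non-members, which is the operational sense in which training data "influences and persists in weights." All names and thresholds here are illustrative assumptions, not the paper's method.

```python
def fit(data, lr=0.01, steps=2000):
    """Fit y = w*x by gradient descent on mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, x, y):
    """Per-example squared error under the fitted weight."""
    return (w * x - y) ** 2

members = [(x, 2.0 * x) for x in range(1, 6)]      # training set follows y = 2x
non_members = [(x, 3.0 * x) for x in range(1, 6)]  # held-out data from a different source

w = fit(members)
threshold = 0.1  # illustrative cutoff

def is_member(x, y):
    """Loss-threshold membership test: low loss suggests the pair was trained on."""
    return loss(w, x, y) < threshold
```

Because the fitted weight encodes the training distribution, members score near-zero loss while non-members do not; scaled up, this is the same signal membership-inference and data-extraction attacks exploit against large models.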

  2. Referee: [Third part of the argument] Third part (remedies): The discussion of unjust enrichment, disgorgement, and potential reach to the model itself is central to arguing that value persists beyond outputs. This would be strengthened by citing concrete examples or doctrines where similar remedies have been applied to trained models or intangible assets derived from protected inputs, to support the claim that post-hoc changes cannot cure the underlying liability.

    Authors: We concur that concrete examples and doctrinal support would strengthen the remedies analysis. The paper invokes general principles of unjust enrichment and disgorgement, which require stripping benefits derived from unauthorized use of protected inputs, including in intellectual property contexts. To respond, we will revise the third part to include references to analogous doctrines and cases involving intangible assets, such as disgorgement of profits in trade secret misappropriation where derived products or knowledge are at issue, and copyright cases where remedies reach works incorporating protected expression. While direct precedents on trained generative models remain limited and emerging, these citations will better illustrate how value retention in weights can trigger remedies that post-hoc mitigation cannot retroactively eliminate. The revision will maintain the position paper's normative focus on ex-ante compliance. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper is a normative legal position paper with no equations, fitted parameters, technical derivations, or empirical predictions. Its three-part argument rests on external legal doctrines (copyright completion, contract/tort restrictions, and disgorgement remedies) and interpretive claims about model weights as fixed copies, all drawn from cited external principles rather than internal self-definitions or self-citation chains. No load-bearing step reduces by construction to the paper's own inputs; the central claim is self-contained against external legal benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard legal doctrines rather than new technical axioms or fitted parameters. No free parameters or invented entities are introduced.

axioms (2)
  • domain assumption Unauthorized copying or ingestion of protected material constitutes a completed legal act even if later outputs are filtered.
    Invoked in the first part of the argument to treat training as an infringement event independent of downstream use.
  • domain assumption Contract and tort rules can restrict use independently of copyright defenses such as fair use.
    Stated in the second part to argue that licenses and anti-free-riding principles survive copyright exceptions.

pith-pipeline@v0.9.0 · 5499 in / 1311 out tokens · 28942 ms · 2026-05-10T05:08:39.074053+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

20 extracted references · 5 canonical work pages · 3 internal anchors

  1. Integrated Cash Management Services, Inc. v. Digital Transactions, Inc., 920 F.2d 171 (2d Cir. 1990).
  2. Sega Enterprises Ltd. v. Accolade, Inc., 977 F.2d 1510 (9th Cir. 1992).
  3. MAI Systems Corp. v. Peak Computer, Inc., 991 F.2d 511 (9th Cir. 1993).
  4. ProCD, Inc. v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996).
  5. Tsubasa System Co. Ltd. v. Toppan Printing Co. Ltd., 1780 Hanrei Jiho 25 (Tokyo Dist. Ct. 2001).
  6. Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
  7. Ryanair Ltd v. PR Aviation BV, Case C-30/14 (CJEU 2015).
  8. Capitol Records, LLC v. ReDigi Inc., 910 F.3d 649 (2d Cir. 2018).
  9. CV-Online Latvia v. Melons, Case C-762/19 (CJEU 2021).
  10. Getty Images (US), Inc. v. Stability AI, Ltd., No. 1:23-cv-00135 (D. Del. filed Feb. 3, 2023).
  11. The New York Times Co. v. Microsoft Corp. et al., No. 1:23-cv-11195 (S.D.N.Y. filed Dec. 27, 2023).
  12. Thomson Reuters Enter. Ctr. GmbH v. Ross Intelligence Inc., No. 1:20-cv-00613 (D. Del. Sept. 25, 2023).
  13. Order on Fair Use, Bartz et al. v. Anthropic PBC, No. C 24-05417 WHA (N.D. Cal. June 23, 2025).
  14. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., et al. Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073.
  15. Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., et al. On the Opportunities and Risks of Foundation Models.
  16. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., et al. Language Models are Few-Shot Learners.
  17. Eldan, R. and Russinovich, M. Who's Harry Potter? Approximate Unlearning in LLMs. arXiv preprint arXiv:2310.02238.
  18. Inan, H., Upasani, K., Chi, J., Rungta, R., Iyer, K., Mao, Y., et al. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. arXiv preprint arXiv:2312.06674.
  19. Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., and Anderson, R. The Curse of Recursion: Training on Generated Data Makes Models Forget. arXiv preprint arXiv:2305.17493.
  20. (entry garbled in extraction; the captured text is a body fragment on exact vs. approximate unlearning, not a citation)