Recognition: no theorem link
How to sketch a learning algorithm
Pith reviewed 2026-05-10 18:02 UTC · model grok-4.3
The pith
Under a stability assumption, deep learning models can be sketched so that the effect of deleting training data is predicted with vanishing error at only Õ(log(1/δ)/ε²) overhead.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a data deletion scheme capable of predicting model outputs with vanishing error ε and failure probability δ in the deep learning setting. Our precomputation and prediction algorithms are only Õ(log(1/δ)/ε²) factors slower than regular training and inference, respectively. The storage requirements are those of Õ(log(1/δ)/ε²) models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions.
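As a rough illustration of what the claimed Õ(log(1/δ)/ε²) overhead means in practice, the minimal sketch below tabulates m ≈ c·log(1/δ)/ε² for a few accuracy targets. The constant c and the polylogarithmic factors hidden by Õ(·) are not stated in the abstract, so the numbers are purely illustrative.

```python
# Illustrative only: counts m ~ c * log(1/delta) / eps^2 stored/sketched models.
# The constant c and any suppressed polylog factors are assumptions, not from the paper.
import math


def sketch_copies(eps: float, delta: float, c: float = 1.0) -> int:
    return math.ceil(c * math.log(1.0 / delta) / eps ** 2)


for eps, delta in [(0.1, 1e-3), (0.05, 1e-6), (0.01, 1e-3)]:
    print(f"eps={eps}, delta={delta}: ~{sketch_copies(eps, delta)} models")
```

Even at the optimistic constant c = 1, tightening ε from 0.1 to 0.01 multiplies the storage and precomputation budget by a factor of 100, which is where the practicality of the scheme will be decided.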
What carries the argument
Local sketching of an arithmetic circuit by computing higher-order derivatives in random complex directions with forward-mode automatic differentiation.
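The paper's construction is not reproduced on this page, but its basic ingredient can be: higher-order directional derivatives of an arithmetic circuit, obtained by forward-mode (truncated-Taylor) automatic differentiation along a random complex direction. In the minimal sketch below, the `Jet` class, the toy circuit `f`, and the truncation order `K` are illustrative assumptions, not the paper's actual sketching algorithm.

```python
# Minimal sketch (not the paper's implementation): forward-mode AD to higher order
# along a random complex direction, applied to a toy arithmetic circuit.
import numpy as np


class Jet:
    """Truncated Taylor series a_0 + a_1 t + ... + a_K t^K; propagating these through
    a circuit is forward-mode automatic differentiation to order K."""

    def __init__(self, coeffs):
        self.c = np.asarray(coeffs, dtype=complex)

    def __add__(self, other):
        other = other if isinstance(other, Jet) else Jet([other] + [0] * (len(self.c) - 1))
        return Jet(self.c + other.c)

    __radd__ = __add__

    def __mul__(self, other):
        if not isinstance(other, Jet):
            return Jet(self.c * other)
        K = len(self.c)
        out = np.zeros(K, dtype=complex)
        for i in range(K):
            for j in range(K - i):
                out[i + j] += self.c[i] * other.c[j]  # Cauchy product, truncated at order K
        return Jet(out)

    __rmul__ = __mul__


def f(x):
    # Toy arithmetic circuit: a low-degree polynomial of three inputs.
    return x[0] * x[0] * x[1] + 3.0 * x[1] * x[2] + x[2]


K = 4                                        # highest derivative order to sketch
rng = np.random.default_rng(0)
x0 = rng.standard_normal(3)                  # point at which the circuit is sketched
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)  # random complex direction

# Seed each input as x0_i + v_i * t and run the circuit once in jet arithmetic;
# the result holds the Taylor coefficients of t -> f(x0 + t*v) up to order K.
inputs = [Jet([x0[i], v[i]] + [0] * (K - 1)) for i in range(3)]
series = f(inputs).c

for k, a_k in enumerate(series):
    print(f"order {k}: Taylor coefficient {complex(a_k)}  (= k-th directional derivative / k!)")
```

A single pass in this jet arithmetic costs only a small, order-dependent constant factor more than evaluating the circuit once, which is the sense in which forward mode makes these derivatives cheap.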
If this is right
- Outputs after any data deletion can be approximated to error ε with probability 1-δ using precomputation and storage of Õ(log(1/δ)/ε²) models.
- Each deletion prediction costs only the same Õ(log(1/δ)/ε²) factor more than ordinary inference.
- Stability is presented as sufficient for the guarantees and compatible with training high-capacity deep networks.
- The construction works by reducing the deletion query to a local sketch of the learning algorithm viewed as an arithmetic circuit.
Where Pith is reading between the lines
- The same sketching idea could be tested on influence-function calculations or on measuring how much any single training example affects downstream accuracy.
- If stability can be checked or encouraged during training, production systems might offer routine data-deletion services without full retraining.
- The complex-derivative sketching technique might extend to other differentiable programs whose behavior under input perturbations needs to be summarized cheaply.
Load-bearing premise
The training process must obey the stability property under which small changes to the data produce controlled changes in the learned model.
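The paper's formal definition of stability is not quoted on this page, so the following is only a generic leave-one-out sensitivity probe in the same spirit: retrain a tiny model with individual examples removed and record how far predictions move on a fixed probe set. The model, data, and deletion budget below are all illustrative assumptions.

```python
# Generic leave-one-out sensitivity probe (an assumed proxy for the paper's stability
# property, not its definition): compare predictions of models trained with and
# without single examples.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)


def train(X, y, steps=500, lr=0.1):
    """Gradient descent on mean squared error for a linear model."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w


probe = rng.standard_normal((50, 5))         # fixed inputs on which predictions are compared
w_full = train(X, y)

gaps = []
for i in range(20):                          # delete each of the first 20 examples in turn
    mask = np.arange(len(y)) != i
    w_loo = train(X[mask], y[mask])
    gaps.append(np.max(np.abs(probe @ w_full - probe @ w_loo)))

print(f"max prediction change over single deletions:  {max(gaps):.4f}")
print(f"mean prediction change over single deletions: {np.mean(gaps):.4f}")
```

A probe of this kind measures only one notion of sensitivity; whatever quantitative form the paper's stability parameter takes, it is that parameter, not this proxy, that the guarantees depend on.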
What would settle it
Exhibit a model for which increasing the number of sketched copies fails to drive the prediction error below a fixed positive threshold, or exhibit a practical deep network whose training violates stability.
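A toy version of what that decisive experiment should show, under the strong assumption that each stored sketch behaves like an independent, unbiased, noisy estimate of the post-deletion output (nothing in the abstract licenses exactly this model): averaged error should fall roughly like 1/√m in the number of sketches m, so a curve that plateaus at a fixed positive level as m grows would count against the mechanism or the stability premise.

```python
# Toy expectation for the falsification test, assuming i.i.d. unbiased sketches.
import numpy as np

rng = np.random.default_rng(2)
true_output = 1.0
for m in [4, 16, 64, 256, 1024]:
    # 2000 trials, each averaging m noisy sketch estimates of the true output.
    trials = rng.normal(true_output, 1.0, size=(2000, m)).mean(axis=1)
    print(f"m={m:5d}  mean |error| = {np.abs(trials - true_output).mean():.4f}")
```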
Original abstract
How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a data deletion scheme for deep learning that, after precomputation, predicts model outputs as if a given subset of training data had been excluded. It achieves vanishing error ε with failure probability δ under a newly introduced 'stability' assumption, using a sketching technique based on higher-order derivatives of arithmetic circuits evaluated in random complex directions via forward-mode automatic differentiation. Precomputation and prediction incur only Õ(log(1/δ)/ε²) overhead relative to standard training and inference, with storage requirements of Õ(log(1/δ)/ε²) models. The stability assumption is claimed to be compatible with powerful models and is supported by a minimal set of microgpt experiments.
Significance. If the stability assumption holds for general deep networks and training regimes, the result would offer an efficient, theoretically grounded approach to data influence queries with direct relevance to privacy (right to be forgotten), interpretability, and scientific understanding of training data effects. The circuit-sketching method via complex directional derivatives is technically novel and could extend beyond deletion tasks. The efficiency bounds are attractive when the assumption is realistic, and the open-sourced microgpt code is a positive step toward reproducibility.
major comments (3)
- [Abstract / proof of main theorem] The vanishing ε/δ guarantees and the Õ(log(1/δ)/ε²) overhead claims in the main result rest entirely on the stability assumption introduced in the proof. The manuscript provides no general theoretical argument establishing that stability holds for typical deep learning architectures; it is supported only by a minimal set of microgpt experiments whose quantitative verification details are not elaborated.
- [Technical core (sketching and bounds)] The error bounds derivation (higher-order derivative sketching) is stated to yield the claimed factors only when stability holds; no sensitivity analysis, alternative bounds, or discussion of what occurs when the assumption is violated appears in the manuscript, making the central claims conditional rather than unconditional.
- [Experiments] The empirical support for stability being 'fully compatible with learning powerful AI models' is limited to microgpt, a small-scale model. This does not constitute broad evidence for the assumption in the deep learning setting referenced in the abstract, undermining the applicability of the deletion scheme beyond the reported tests.
minor comments (2)
- [Abstract and experiments] Clarify the precise definition and quantitative verification procedure for the stability assumption (e.g., how it is measured in the microgpt runs) so readers can assess its plausibility independently.
- [Code and experiments] The GitHub link is welcome, but the manuscript should include a brief description of the experimental protocol used to check stability rather than referring readers solely to the code.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help clarify the scope and limitations of our work. We address each major comment point by point below, with revisions planned where appropriate to improve transparency without overstating the results.
Point-by-point responses
Referee: [Abstract / proof of main theorem] The vanishing ε/δ guarantees and the Õ(log(1/δ)/ε²) overhead claims in the main result rest entirely on the stability assumption introduced in the proof. The manuscript provides no general theoretical argument establishing that stability holds for typical deep learning architectures; it is supported only by a minimal set of microgpt experiments whose quantitative verification details are not elaborated.
Authors: We agree that the main theorem's guarantees are conditional on the stability assumption, which is introduced explicitly in the proof and abstract. The manuscript does not provide a general theoretical argument that stability holds for arbitrary deep learning architectures, as developing such a proof would constitute a separate and substantial contribution outside the scope of the sketching technique. The experiments are described as minimal, and we will revise the manuscript to elaborate on the quantitative verification details (e.g., specific metrics used to confirm the stability parameter bounds in the microgpt runs). This will make the empirical support more precise while retaining the conditional framing of the claims. revision: partial
Referee: [Technical core (sketching and bounds)] The error bounds derivation (higher-order derivative sketching) is stated to yield the claimed factors only when stability holds; no sensitivity analysis, alternative bounds, or discussion of what occurs when the assumption is violated appears in the manuscript, making the central claims conditional rather than unconditional.
Authors: The error bounds for the higher-order derivative sketching are derived under stability, as this controls the remainder terms in the complex directional derivatives. We acknowledge the lack of sensitivity analysis or discussion of violations in the current manuscript. In revision, we will add a dedicated paragraph in the technical section analyzing how the bounds degrade if stability is mildly violated (e.g., polynomial dependence on the stability parameter) and note that the Õ(log(1/δ)/ε²) factors no longer hold unconditionally. This will make the conditional nature of the claims explicit without altering the core derivation. revision: yes
Referee: [Experiments] The empirical support for stability being 'fully compatible with learning powerful AI models' is limited to microgpt, a small-scale model. This does not constitute broad evidence for the assumption in the deep learning setting referenced in the abstract, undermining the applicability of the deletion scheme beyond the reported tests.
Authors: The experiments are restricted to microgpt, as stated, and serve only as a minimal demonstration that stability can hold in a multi-layer transformer-like setting with attention. We do not present them as broad evidence for general deep learning. We will revise the abstract, introduction, and a new limitations paragraph to rephrase the compatibility claim more cautiously, stressing that stability must be checked for any target architecture and training procedure. This will better align the language with the limited scope of the tests. revision: partial
What the revision does not provide
- A general theoretical argument establishing that stability holds for typical deep learning architectures
- Broad empirical evidence for stability on large-scale deep learning models beyond the microgpt experiments
Circularity Check
No significant circularity; bounds derived conditionally from explicit assumption
Full rationale
The paper derives its Õ(log(1/δ)/ε²) data-deletion scheme and vanishing-error guarantees from a new sketching technique based on higher-order derivatives of arithmetic circuits in random complex directions. The central proof is explicitly conditional on a posited stability assumption, which is introduced as an external premise rather than derived from the target bounds. The assumption is then checked empirically in a minimal microgpt setting, but this verification step does not create a definitional loop or rename a fitted quantity as a prediction. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked as load-bearing steps in the provided text. The derivation chain therefore remains self-contained and does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: stability of the learning algorithm