Recognition: no theorem link
How to sketch a learning algorithm
Pith reviewed 2026-05-10 18:02 UTC · model grok-4.3
The pith
Under a stability assumption, deep learning models can be sketched so that the effect of deleting training data is predicted with vanishing error at only Õ(log(1/δ)/ε²) overhead.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a data deletion scheme capable of predicting model outputs with vanishing error ε and failure probability δ in the deep learning setting. Our precomputation and prediction algorithms are only Õ(log(1/δ)/ε²) factors slower than regular training and inference, respectively. The storage requirements are those of Õ(log(1/δ)/ε²) models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions.
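As a rough illustration of what the claimed Õ(log(1/δ)/ε²) overhead means in practice, the minimal sketch below tabulates m ≈ c·log(1/δ)/ε² for a few accuracy targets. The constant c and the polylogarithmic factors hidden by Õ(·) are not stated in the abstract, so the numbers are purely illustrative.

```python
# Illustrative only: counts m ~ c * log(1/delta) / eps^2 stored/sketched models.
# The constant c and any suppressed polylog factors are assumptions, not from the paper.
import math


def sketch_copies(eps: float, delta: float, c: float = 1.0) -> int:
    return math.ceil(c * math.log(1.0 / delta) / eps ** 2)


for eps, delta in [(0.1, 1e-3), (0.05, 1e-6), (0.01, 1e-3)]:
    print(f"eps={eps}, delta={delta}: ~{sketch_copies(eps, delta)} models")
```

Even at the optimistic constant c = 1, tightening ε from 0.1 to 0.01 multiplies the storage and precomputation budget by a factor of 100, which is where the practicality of the scheme will be decided.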
What carries the argument
Local sketching of an arithmetic circuit by computing higher-order derivatives in random complex directions with forward-mode automatic differentiation.
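The paper's construction is not reproduced on this page, but its basic ingredient can be: higher-order directional derivatives of an arithmetic circuit, obtained by forward-mode (truncated-Taylor) automatic differentiation along a random complex direction. In the minimal sketch below, the `Jet` class, the toy circuit `f`, and the truncation order `K` are illustrative assumptions, not the paper's actual sketching algorithm.

```python
# Minimal sketch (not the paper's implementation): forward-mode AD to higher order
# along a random complex direction, applied to a toy arithmetic circuit.
import numpy as np


class Jet:
    """Truncated Taylor series a_0 + a_1 t + ... + a_K t^K; propagating these through
    a circuit is forward-mode automatic differentiation to order K."""

    def __init__(self, coeffs):
        self.c = np.asarray(coeffs, dtype=complex)

    def __add__(self, other):
        other = other if isinstance(other, Jet) else Jet([other] + [0] * (len(self.c) - 1))
        return Jet(self.c + other.c)

    __radd__ = __add__

    def __mul__(self, other):
        if not isinstance(other, Jet):
            return Jet(self.c * other)
        K = len(self.c)
        out = np.zeros(K, dtype=complex)
        for i in range(K):
            for j in range(K - i):
                out[i + j] += self.c[i] * other.c[j]  # Cauchy product, truncated at order K
        return Jet(out)

    __rmul__ = __mul__


def f(x):
    # Toy arithmetic circuit: a low-degree polynomial of three inputs.
    return x[0] * x[0] * x[1] + 3.0 * x[1] * x[2] + x[2]


K = 4                                        # highest derivative order to sketch
rng = np.random.default_rng(0)
x0 = rng.standard_normal(3)                  # point at which the circuit is sketched
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)  # random complex direction

# Seed each input as x0_i + v_i * t and run the circuit once in jet arithmetic;
# the result holds the Taylor coefficients of t -> f(x0 + t*v) up to order K.
inputs = [Jet([x0[i], v[i]] + [0] * (K - 1)) for i in range(3)]
series = f(inputs).c

for k, a_k in enumerate(series):
    print(f"order {k}: Taylor coefficient {complex(a_k)}  (= k-th directional derivative / k!)")
```

A single pass in this jet arithmetic costs only a small, order-dependent constant factor more than evaluating the circuit once, which is the sense in which forward mode makes these derivatives cheap.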
If this is right
- Outputs after any data deletion can be approximated to error ε with probability 1-δ using precomputation and storage of Õ(log(1/δ)/ε²) models.
- Each deletion prediction costs only the same Õ(log(1/δ)/ε²) factor more than ordinary inference.
- Stability is presented as sufficient for the guarantees and compatible with training high-capacity deep networks.
- The construction works by reducing the deletion query to a local sketch of the learning algorithm viewed as an arithmetic circuit.
Where Pith is reading between the lines
- The same sketching idea could be tested on influence-function calculations or on measuring how much any single training example affects downstream accuracy.
- If stability can be checked or encouraged during training, production systems might offer routine data-deletion services without full retraining.
- The complex-derivative sketching technique might extend to other differentiable programs whose behavior under input perturbations needs to be summarized cheaply.
Load-bearing premise
The training process must obey the stability property under which small changes to the data produce controlled changes in the learned model.
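The paper's formal definition of stability is not quoted on this page, so the following is only a generic leave-one-out sensitivity probe in the same spirit: retrain a tiny model with individual examples removed and record how far predictions move on a fixed probe set. The model, data, and deletion budget below are all illustrative assumptions.

```python
# Generic leave-one-out sensitivity probe (an assumed proxy for the paper's stability
# property, not its definition): compare predictions of models trained with and
# without single examples.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)


def train(X, y, steps=500, lr=0.1):
    """Gradient descent on mean squared error for a linear model."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w


probe = rng.standard_normal((50, 5))         # fixed inputs on which predictions are compared
w_full = train(X, y)

gaps = []
for i in range(20):                          # delete each of the first 20 examples in turn
    mask = np.arange(len(y)) != i
    w_loo = train(X[mask], y[mask])
    gaps.append(np.max(np.abs(probe @ w_full - probe @ w_loo)))

print(f"max prediction change over single deletions:  {max(gaps):.4f}")
print(f"mean prediction change over single deletions: {np.mean(gaps):.4f}")
```

A probe of this kind measures only one notion of sensitivity; whatever quantitative form the paper's stability parameter takes, it is that parameter, not this proxy, that the guarantees depend on.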
What would settle it
Exhibit a model for which increasing the number of sketched copies fails to drive the prediction error below a fixed positive threshold, or exhibit a practical deep network whose training violates stability.
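A toy version of what that decisive experiment should show, under the strong assumption that each stored sketch behaves like an independent, unbiased, noisy estimate of the post-deletion output (nothing in the abstract licenses exactly this model): averaged error should fall roughly like 1/√m in the number of sketches m, so a curve that plateaus at a fixed positive level as m grows would count against the mechanism or the stability premise.

```python
# Toy expectation for the falsification test, assuming i.i.d. unbiased sketches.
import numpy as np

rng = np.random.default_rng(2)
true_output = 1.0
for m in [4, 16, 64, 256, 1024]:
    # 2000 trials, each averaging m noisy sketch estimates of the true output.
    trials = rng.normal(true_output, 1.0, size=(2000, m)).mean(axis=1)
    print(f"m={m:5d}  mean |error| = {np.abs(trials - true_output).mean():.4f}")
```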
Original abstract
How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a data deletion scheme for deep learning that, after precomputation, predicts model outputs as if a given subset of training data had been excluded. It achieves vanishing error ε with failure probability δ under a newly introduced 'stability' assumption, using a sketching technique based on higher-order derivatives of arithmetic circuits evaluated in random complex directions via forward-mode automatic differentiation. Precomputation and prediction incur only Õ(log(1/δ)/ε²) overhead relative to standard training and inference, with storage requirements of Õ(log(1/δ)/ε²) models. The stability assumption is claimed to be compatible with powerful models and is supported by a minimal set of microgpt experiments.
Significance. If the stability assumption holds for general deep networks and training regimes, the result would offer an efficient, theoretically grounded approach to data influence queries with direct relevance to privacy (right to be forgotten), interpretability, and scientific understanding of training data effects. The circuit-sketching method via complex directional derivatives is technically novel and could extend beyond deletion tasks. The efficiency bounds are attractive when the assumption is realistic, and the open-sourced microgpt code is a positive step toward reproducibility.
major comments (3)
- [Abstract / proof of main theorem] The vanishing ε/δ guarantees and the Õ(log(1/δ)/ε²) overhead claims in the main result rest entirely on the stability assumption introduced in the proof. The manuscript provides no general theoretical argument establishing that stability holds for typical deep learning architectures; it is supported only by a minimal set of microgpt experiments whose quantitative verification details are not elaborated.
- [Technical core (sketching and bounds)] The error bounds derivation (higher-order derivative sketching) is stated to yield the claimed factors only when stability holds; no sensitivity analysis, alternative bounds, or discussion of what occurs when the assumption is violated appears in the manuscript, making the central claims conditional rather than unconditional.
- [Experiments] The empirical support for stability being 'fully compatible with learning powerful AI models' is limited to microgpt, a small-scale model. This does not constitute broad evidence for the assumption in the deep learning setting referenced in the abstract, undermining the applicability of the deletion scheme beyond the reported tests.
minor comments (2)
- [Abstract and experiments] Clarify the precise definition and quantitative verification procedure for the stability assumption (e.g., how it is measured in the microgpt runs) so readers can assess its plausibility independently.
- [Code and experiments] The GitHub link is welcome, but the manuscript should include a brief description of the experimental protocol used to check stability rather than referring readers solely to the code.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help clarify the scope and limitations of our work. We address each major comment point by point below, with revisions planned where appropriate to improve transparency without overstating the results.
Point-by-point responses
Referee: [Abstract / proof of main theorem] The vanishing ε/δ guarantees and the Õ(log(1/δ)/ε²) overhead claims in the main result rest entirely on the stability assumption introduced in the proof. The manuscript provides no general theoretical argument establishing that stability holds for typical deep learning architectures; it is supported only by a minimal set of microgpt experiments whose quantitative verification details are not elaborated.
Authors: We agree that the main theorem's guarantees are conditional on the stability assumption, which is introduced explicitly in the proof and abstract. The manuscript does not provide a general theoretical argument that stability holds for arbitrary deep learning architectures, as developing such a proof would constitute a separate and substantial contribution outside the scope of the sketching technique. The experiments are described as minimal, and we will revise the manuscript to elaborate on the quantitative verification details (e.g., specific metrics used to confirm the stability parameter bounds in the microgpt runs). This will make the empirical support more precise while retaining the conditional framing of the claims. revision: partial
Referee: [Technical core (sketching and bounds)] The error bounds derivation (higher-order derivative sketching) is stated to yield the claimed factors only when stability holds; no sensitivity analysis, alternative bounds, or discussion of what occurs when the assumption is violated appears in the manuscript, making the central claims conditional rather than unconditional.
Authors: The error bounds for the higher-order derivative sketching are derived under stability, as this controls the remainder terms in the complex directional derivatives. We acknowledge the lack of sensitivity analysis or discussion of violations in the current manuscript. In revision, we will add a dedicated paragraph in the technical section analyzing how the bounds degrade if stability is mildly violated (e.g., polynomial dependence on the stability parameter) and note that the Õ(log(1/δ)/ε²) factors no longer hold unconditionally. This will make the conditional nature of the claims explicit without altering the core derivation. revision: yes
Referee: [Experiments] The empirical support for stability being 'fully compatible with learning powerful AI models' is limited to microgpt, a small-scale model. This does not constitute broad evidence for the assumption in the deep learning setting referenced in the abstract, undermining the applicability of the deletion scheme beyond the reported tests.
Authors: The experiments are restricted to microgpt, as stated, and serve only as a minimal demonstration that stability can hold in a multi-layer transformer-like setting with attention. We do not present them as broad evidence for general deep learning. We will revise the abstract, introduction, and a new limitations paragraph to rephrase the compatibility claim more cautiously, stressing that stability must be checked for any target architecture and training procedure. This will better align the language with the limited scope of the tests. revision: partial
What the revision does not provide
- A general theoretical argument establishing that stability holds for typical deep learning architectures
- Broad empirical evidence for stability on large-scale deep learning models beyond the microgpt experiments
Circularity Check
No significant circularity; bounds derived conditionally from explicit assumption
Full rationale
The paper derives its Õ(log(1/δ)/ε²) data-deletion scheme and vanishing-error guarantees from a new sketching technique based on higher-order derivatives of arithmetic circuits in random complex directions. The central proof is explicitly conditional on a posited stability assumption, which is introduced as an external premise rather than derived from the target bounds. The assumption is then checked empirically in a minimal microgpt setting, but this verification step does not create a definitional loop or rename a fitted quantity as a prediction. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked as load-bearing steps in the provided text. The derivation chain therefore remains self-contained and does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: stability of the learning algorithm