pith. machine review for the scientific record.

arxiv: 2605.11592 · v1 · submitted 2026-05-12 · 💻 cs.LG · cs.AI · cs.CR

Recognition: 2 theorem links · Lean Theorem

SoK: Unlearnability and Unlearning for Model Dememorization

Derui Wang, Mengying Zhang, Minhui Xue, Ruoxi Sun, Shuang Hao, Xiaoyu Xia

Pith reviewed 2026-05-13 01:40 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CR
keywords unlearnability · machine unlearning · model dememorization · data privacy · certified unlearning · systematization of knowledge · model forgetting

The pith

Unlearnability and unlearning both produce only shallow dememorization of sensitive data, but certified unlearning supplies the first theoretical guarantee on forgetting depth.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper integrates two stages of defense against models memorizing private information: unlearnability, which adds invisible perturbations to training data to lower its learnability, and unlearning, which erases acquired knowledge from a trained model. It demonstrates that both techniques currently deliver only shallow dememorization, allowing recovery of the data under minor weight changes, and that they interfere with each other in practice. The work supplies a single taxonomy covering both families of methods, runs empirical tests that expose their robustness limits and mutual effects, and derives the first formal bound on how deeply certified unlearning can erase information. A reader interested in machine-learning privacy would care because these findings indicate where current safeguards fall short and what is required to reach a reliably forgotten state for sensitive knowledge.
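
To make the data-release half of that pairing concrete, below is a minimal sketch of error-minimizing "unlearnability" noise in the spirit of Unlearnable Examples [80]. It assumes a small PyTorch classifier and illustrative hyper-parameters (eps, alpha, steps); it is not the paper's exact procedure, only the general recipe the unlearnability family builds on.

```python
# Minimal sketch of error-minimizing "unlearnability" noise (in the spirit of
# Unlearnable Examples [80]); hyper-parameters are illustrative, not the paper's.
import torch
import torch.nn.functional as F

def craft_unlearnability_noise(model, x, y, eps=8 / 255, alpha=2 / 255, steps=20):
    """Per-sample noise that *minimizes* the training loss, so the perturbed
    data looks already learned and contributes little training signal."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # descend the loss w.r.t. the input
            delta.clamp_(-eps, eps)              # keep the perturbation imperceptible
        delta.grad.zero_()
        model.zero_grad()                        # discard stray parameter gradients
    return delta.detach()
```

The sign of the update is the only difference from a standard PGD adversarial attack: the noise descends the training loss instead of ascending it.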

Core claim

Unlearnability at the data-release stage and unlearning at the post-training stage share the goal of dememorization, yet both exhibit shallow effects that fail under perturbations; input noise from unlearnability can impair later unlearning, while unlearning can restore knowledge hidden by unlearnability; certified unlearning, however, yields the first provable bound on dememorization depth.

What carries the argument

The theoretical guarantee on dememorization depth for models that have undergone certified unlearning, which formally bounds the extent to which sensitive information can be removed from model parameters.
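
The review does not reproduce the theorem itself, so the following is only a hedged sketch of the kind of condition such a guarantee plausibly builds on: the standard (ε, δ)-certified-removal requirement in the style of Guo et al. [69]. The symbols (a measurement A of the released weights, an outcome set S, θ_unlearn and θ_retrain for the unlearned and retrained-from-scratch models) are illustrative notation, not the paper's.

```latex
% Standard (\epsilon,\delta)-certified-removal condition (Guo et al. [69]);
% an illustrative foundation, not the paper's depth theorem.
\forall S:\quad
\Pr\!\left[A(\theta_{\mathrm{unlearn}}) \in S\right]
  \le e^{\epsilon}\,\Pr\!\left[A(\theta_{\mathrm{retrain}}) \in S\right] + \delta
\quad\text{and}\quad
\Pr\!\left[A(\theta_{\mathrm{retrain}}) \in S\right]
  \le e^{\epsilon}\,\Pr\!\left[A(\theta_{\mathrm{unlearn}}) \in S\right] + \delta .
```

On this review's reading, a depth bound would additionally have to constrain what A can extract not only from the released weights but from any weights within a small perturbation radius of them, which is what separates the certified case from the shallow dememorization observed empirically.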

If this is right

  • Perturbations introduced by unlearnability reduce the effectiveness of subsequent unlearning steps.
  • Unlearning can recover domain-level knowledge that unlearnability had attempted to conceal.
  • Deeper immemorization of sensitive data requires combining unlearnability and unlearning under formal certification rather than using either in isolation.
  • Without certification, both families of methods leave models vulnerable to recovery of the target data under small weight perturbations.
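
To ground the last point in this list, below is a minimal relearning probe, assuming a PyTorch model plus loaders for a benign calibration set and the forgotten data. It is not the paper's recovery attack (Figures 10 to 18 report those); it is only a hedged sketch of the underlying idea: a brief fine-tune acts as a small weight perturbation, and a rebound in accuracy on the forget set signals shallow dememorization.

```python
# Minimal probe for shallow dememorization: briefly fine-tune an unlearned
# model on benign data (a small weight perturbation) and check whether
# accuracy on the supposedly forgotten data rebounds. Illustrative only;
# not the paper's recovery-attack protocol.
import torch
import torch.nn.functional as F

@torch.no_grad()
def accuracy(model, loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / max(total, 1)

def recovery_probe(unlearned_model, benign_loader, forget_loader, lr=1e-3, epochs=1):
    before = accuracy(unlearned_model, forget_loader)
    opt = torch.optim.SGD(unlearned_model.parameters(), lr=lr)
    unlearned_model.train()
    for _ in range(epochs):
        for x, y in benign_loader:              # no forgotten data is used here
            opt.zero_grad()
            F.cross_entropy(unlearned_model(x), y).backward()
            opt.step()
    after = accuracy(unlearned_model, forget_loader)
    return before, after                        # large (after - before) => shallow forgetting
```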

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Sequential application of unlearnability followed by certified unlearning could be tested as a practical pipeline for stronger end-to-end privacy (a minimal sketch follows this list).
  • The shallow-dememorization phenomenon may appear in related privacy tools such as differential privacy, suggesting a broader pattern to investigate.
  • The taxonomy and guarantee could be extended to federated or continual-learning settings where data removal requests arrive incrementally.
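
For the first extension above, a minimal sketch of what a sequential pipeline could look like. Everything here is assumption: the craft_unlearnability_noise helper from the earlier sketch stands in for the data-release stage, and the forget step (gradient ascent on the forget set plus Gaussian weight noise) is only a placeholder for a real certified-unlearning procedure such as Newton-update removal [69] or noisy SGD [36]; it carries no certification itself.

```python
# Hypothetical two-stage pipeline: unlearnability at data release, then an
# unlearning step after training. The forget step is a placeholder, NOT a
# certified procedure; see [36, 69] for actual certified unlearning.
import torch
import torch.nn.functional as F

def release_protected(model, x, y):
    # Stage 1: the data owner perturbs samples before release
    # (reuses craft_unlearnability_noise from the sketch above).
    return x + craft_unlearnability_noise(model, x, y), y

def forget(model, forget_loader, lr=1e-2, sigma=1e-3):
    # Stage 2: the trainer later honours a deletion request.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for x, y in forget_loader:
        opt.zero_grad()
        (-F.cross_entropy(model(x), y)).backward()     # gradient ascent on the forget set
        opt.step()
    with torch.no_grad():
        for p in model.parameters():
            p.add_(sigma * torch.randn_like(p))        # illustrative noise, no guarantee
    return model
```

The paper's own finding that the two stages interfere (input noise can impair unlearning, and unlearning can restore what the noise hid) is exactly what such a pipeline would need to measure.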

Load-bearing premise

That the empirical evaluations of leading methods and the observed interplay between unlearnability and unlearning generalize beyond the specific datasets and models tested in the study.

What would settle it

A counterexample in which a model processed by certified unlearning still permits reconstruction of the supposedly forgotten data at a level exceeding the derived depth bound would falsify the theoretical guarantee.
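
One standard way to operationalize "permits reconstruction" is a loss-threshold membership-inference test, the same family of measurement the MIA figures (6 to 8) report. The sketch below is a hedged, minimal version assuming a PyTorch model and loaders of forgotten and held-out non-member examples; the median-loss calibration is illustrative and is not the paper's evaluation protocol.

```python
# Minimal loss-threshold membership-inference test on the forget set.
# The calibration (median non-member loss) is illustrative only.
import torch
import torch.nn.functional as F

@torch.no_grad()
def per_example_losses(model, loader):
    model.eval()
    losses = []
    for x, y in loader:
        losses.append(F.cross_entropy(model(x), y, reduction="none"))
    return torch.cat(losses)

def mia_advantage(model, forget_loader, nonmember_loader):
    forget = per_example_losses(model, forget_loader)
    nonmem = per_example_losses(model, nonmember_loader)
    threshold = nonmem.median()                  # calibrate on non-members
    tpr = (forget < threshold).float().mean()    # forgotten data flagged as "member"
    fpr = (nonmem < threshold).float().mean()    # roughly 0.5 by construction
    return (tpr - fpr).item()                    # > 0 suggests residual memorization
```

A positive advantage on a model that certified unlearning says should be indistinguishable from retraining, at a level beyond what the depth bound allows, would be the counterexample described above.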

Figures

Figures reproduced from arXiv: 2605.11592 by Derui Wang, Mengying Zhang, Minhui Xue, Ruoxi Sun, Shuang Hao, Xiaoyu Xia.

Figure 1: An overview of the model dememorization framework within the ML model development lifecycle.
Figure 2: Shallow unlearnability and shallow unlearning re…
Figure 3: The taxonomy of model dememorization. For samples 𝑥𝑖 from the same class, the corresponding 𝛿𝑖 can be shared, yielding class-wise unlearnability noise. Similar to E-Max, E-Min methods optimize perturbations through gradient-based procedures. However, they have been applied more broadly across data modalities, including images [52, 74], text [101, 217], and audio [121, 207], as well as across tasks such as …
Figure 4: The test accuracy (%) of ViT-Tiny trained on un…
Figure 6: MIA on OPS (left: class-level; right: subset-level).
Figure 7: MIA on PUE (left: class-level; right: subset-level).
Figure 8: MIA on TUE (left: class-level; right: subset-level).
Figure 9: Parametric robustness of unlearnability perturba…
Figure 10: Recovery attack against unlearned classifiers trained on UE-s from Table 14. Top: Recovery attack against the…
Figure 11: Recovery attack against unlearned classifiers using FT trained on UE-s from Table 14.
Figure 12: Recovery attack against unlearned models trained on Regtext.
Figure 13: Recovery attacks against unlearned classifiers trained on TUE from Table 15. Top: Recovery attack against the…
Figure 14: Recovery attacks against unlearned classifiers trained on PUE from Table 16. Top: Recovery attack against the…
Figure 15: Recovery attacks against unlearned classifiers trained on OPS from Table 17. Top: Recovery attack against the…
Figure 16: Recovery attack against unlearned classifiers using FT trained on TUE from Table 15.
Figure 17: Recovery attack against unlearned classifiers using FT trained on PUE from Table 16.
Figure 18: Recovery attack against unlearned classifiers using FT trained on OPS from Table 17.
Figure 19: The test accuracy (%) of ResNet-18 trained on vary…
read the original abstract

Advanced model dememorization methods, including availability poisoning (unlearnability) and machine unlearning, are emerging as key safeguards against data misuse in machine learning (ML). At the training stage, unlearnability embeds imperceptible perturbations into data before release to reduce learnability. At the post-training stage, unlearning removes previously acquired information from models to prevent unauthorized disclosure or use. While both defenses aim to preserve the right to withhold knowledge, their vulnerabilities and shared foundations remain unclear. Specifically, both unlearnability and unlearning suffer from issues such as shallow dememorization, leading to falsely claimed data learnability reduction or forgetting in the presence of weight perturbations. Moreover, input perturbations may affect the effectiveness of downstream unlearning, while unlearning may inadvertently recover domain knowledge hidden by unlearnability. This interplay calls for deeper investigation. Finally, there is a lack of formal guarantees to provide theoretical insights into current defenses against shallow dememorization. In this Systematization of Knowledge, we present the first integrated analysis of model dememorization approaches leveraging unlearnability and unlearning. Our contributions are threefold: (i) a unified taxonomy of unlearnability and scalable unlearning methods; (ii) an empirical evaluation revealing the robustness, interplay, and shallow dememorization of leading methods; and (iii) the first theoretical guarantee on dememorization depth for models processed through certified unlearning. These results lay the foundation for unifying dememorization mechanisms across the ML lifecycle to achieve a deeper immemor state for sensitive knowledge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript is a Systematization of Knowledge (SoK) on model dememorization that integrates availability poisoning (unlearnability) at training time with post-training machine unlearning. It contributes (i) a unified taxonomy of unlearnability and scalable unlearning methods, (ii) empirical evaluations of leading methods that examine robustness, the interplay between the two defenses, and the problem of shallow dememorization, and (iii) the first theoretical guarantee on dememorization depth for models processed by certified unlearning.

Significance. If the empirical results on shallow dememorization and cross-stage interplay are reproducible and the theoretical guarantee holds under realistic conditions, the work would provide a valuable organizing framework for dememorization research across the ML lifecycle. The systematization itself organizes a fragmented literature; the attempt to supply a formal depth bound is a positive step toward moving beyond purely empirical claims.

major comments (2)
  1. [§5 (Theoretical Guarantee)] The main theorem establishing the dememorization-depth bound assumes exact certification (perfect influence removal or exact privacy parameters). Certified unlearning procedures in practice rely on approximations (influence-function estimates, finite-sample DP, or gradient-based surrogates). The manuscript does not show that the bound survives these approximations, which directly weakens its applicability to the shallow-dememorization phenomenon identified in the empirical sections.
  2. [§4 (Empirical Evaluation)] The claims that leading unlearnability and unlearning methods exhibit shallow dememorization and that input perturbations affect downstream unlearning rest on evaluations whose baselines, statistical controls, and exact metrics are not fully specified. Without these details it is impossible to assess whether the reported interplay generalizes or is an artifact of the chosen datasets and models.
minor comments (2)
  1. [Abstract] The abstract states that the work presents 'the first theoretical guarantee' but does not indicate the precise form of the bound or the key assumptions; adding one sentence would improve clarity.
  2. [§2 (Preliminaries)] Notation for 'dememorization depth' is introduced without an explicit equation reference in the early sections; a forward pointer to the definition used in the theorem would help readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of our SoK. We address each major comment below and will revise the manuscript accordingly to improve rigor and clarity.

read point-by-point responses
  1. Referee: [§5 (Theoretical Guarantee)] The main theorem establishing the dememorization-depth bound assumes exact certification (perfect influence removal or exact privacy parameters). Certified unlearning procedures in practice rely on approximations (influence-function estimates, finite-sample DP, or gradient-based surrogates). The manuscript does not show that the bound survives these approximations, which directly weakens its applicability to the shallow-dememorization phenomenon identified in the empirical sections.

    Authors: We agree that the main theorem is stated under exact certification. This provides a clean first formal guarantee on dememorization depth, consistent with the theoretical literature on certified unlearning. To address the gap, the revised manuscript will include an extended analysis (new subsection in §5) that propagates approximation errors from influence estimates and finite-sample DP relaxations into the depth bound, yielding a relaxed but still non-trivial guarantee. This will directly connect the theory to the shallow-dememorization observations in the empirical sections. revision: yes

  2. Referee: [§4 (Empirical Evaluation)] The claims that leading unlearnability and unlearning methods exhibit shallow dememorization and that input perturbations affect downstream unlearning rest on evaluations whose baselines, statistical controls, and exact metrics are not fully specified. Without these details it is impossible to assess whether the reported interplay generalizes or is an artifact of the chosen datasets and models.

    Authors: We acknowledge that §4 would benefit from greater explicitness. In the revision we will expand the experimental protocol subsection to specify: (i) exact baseline implementations and hyper-parameters, (ii) the statistical tests and multiple-comparison corrections used, (iii) precise definitions of all metrics for shallow dememorization and cross-stage interplay, and (iv) additional controls and sensitivity checks across datasets and model scales. These additions will allow readers to evaluate generalizability directly. revision: yes

Circularity Check

0 steps flagged

No circularity: SoK paper presents taxonomy, evaluation, and guarantee without self-referential derivations.

full rationale

The paper is a systematization of knowledge offering a unified taxonomy, empirical study of methods, and a claimed first theoretical guarantee on dememorization depth under certified unlearning. No equations, predictions, or first-principles results are shown that reduce by construction to author-defined inputs, fitted parameters, or self-citation chains. The guarantee is positioned as a novel contribution based on analysis of existing certified unlearning procedures rather than tautological redefinition. Standard citations to prior unlearning and poisoning literature do not constitute load-bearing self-referential justification per the enumerated patterns. The work remains self-contained against external benchmarks without any reduction of claims to its own fitted values or renamed ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on standard ML assumptions about model training dynamics and the existence of certified unlearning procedures; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: Certified unlearning procedures exist and can be applied to trained models
    Invoked when stating the theoretical guarantee on dememorization depth.

pith-pipeline@v0.9.0 · 5594 in / 1139 out tokens · 34142 ms · 2026-05-13T01:40:01.976955+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

242 extracted references · 242 canonical work pages · 4 internal anchors

  1. [1] 2016. GDPR Article 17: Right to Erasure. gdpr-info.eu. https://gdpr-info.eu/art-17-gdpr/
  2. [2] 2023. California Consumer Privacy Act (CCPA) & CPRA Overview. California Department of Justice. https://www.oag.ca.gov/privacy/ccpa
  3. [3] 2024. CCPA/CPRA Regulations. California Privacy Protection Agency. https://cppa.ca.gov/regulations/
  4. [4] 2024. Regulation (EU) 2024/1689: Artificial Intelligence Act. Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng
  5. [5] Anna Ablove, Shreyas Chandrashekaran, Xiao Qiang, and Roya Ensafi. 2026. Characterizing the Implementation of Censorship Policies in Chinese LLM Services. In NDSS
  6. [6] Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023)
  7. [7] Sk Miraj Ahmed, Umit Yigit Basaran, Dripta S Raychaudhuri, Arindam Dutta, Rohit Kundu, Fahim Faisal Niloy, Basak Guler, and Amit K Roy-Chowdhury
  8. [8] Towards Source-Free Machine Unlearning. In CVPR
  9. [9] Silas Alberti, Kenan Hasanaliyev, Manav Shah, and Stefano Ermon. 2025. Data Unlearning in Diffusion Models. In ICLR
  10. [10] Nasser Aldaghri, Hessam Mahdavifar, and Ahmad Beirami. 2021. Coded machine unlearning. IEEE Access 9 (2021), 88137–88150
  11. [11] Youssef Allouah, Rachid Guerraoui, and Sanmi Koyejo. 2026. Distributional Machine Unlearning via Selective Data Removal. In ICLR
  12. [12] Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, and Sanmi Koyejo. 2025. The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning. In ICLR
  13. [13] Sadia Asif and Mohammad Mohammadi Amiri. 2026. OFMU: Optimization-Driven Framework for Machine Unlearning. In ICLR
  14. [14] George-Octavian Bărbulescu and Peter Triantafillou. 2024. To each (textual sequence) its own: improving memorized-data unlearning in large language models. In ICML
  15. [15] Umit Yigit Basaran, Sk Miraj Ahmed, Amit Roy-Chowdhury, and Basak Guler
  16. [16] A Certified Unlearning Approach without Access to Source Data. In ICML
  17. [17] Shristi Das Biswas, Arani Roy, and Kaushik Roy. 2025. Cure: Concept unlearning via orthogonal representation editing in diffusion models. In NeurIPS
  18. [18] Jacob L Block, Aryan Mokhtari, and Sanjay Shakkottai. 2025. Machine Unlearning under Overparameterization. In NeurIPS
  19. [19] Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot. 2021. Machine unlearning. In IEEE SP
  20. [20] Alexander Brown, Nenad Tomasev, Jan Freyberg, Yuan Liu, Alan Karthikesalingam, and Jessica Schrouff. 2023. Detecting shortcut learning for fair medical AI using shortcut testing. Nature Communications 14, 1 (2023), 4314
  21. [21] Nhung Bui, Xinyang Lu, Rachael Hwee Ling Sim, See-Kiong Ng, and Bryan Kian Hsiang Low. 2026. How to Cure Newton for Unlearning Neural Networks? An Empirical Study from the Hessian Perspective. In ICLR
  22. [22] Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, and Jinghui Chen
  23. [23] Impress: Evaluating the resilience of imperceptible perturbations against unauthorized data usage in diffusion-based generative ai. In NeurIPS
  24. [24] Sungmin Cha, Sungjun Cho, Dasol Hwang, and Moontae Lee. 2025. Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs. In ICLR
  25. [25] Chaochao Chen, Jiaming Zhang, Yuyuan Li, and Zhongxuan Han. 2024. One for all: A universal generator for concept unlearnability via multi-modal alignment. In ICML
  26. [26] Hang Chen, Jiaying Zhu, Xinyu Yang, and Wenya Wang. 2026. CLUE: Conflict-guided Localization for LLM Unlearning Framework. In ICLR
  27. [27] Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, and Chen Wang. 2023. Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. In CVPR
  28. [28] Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, and Yang Zhang. 2022. Graph unlearning. In CCS
  29. [29] Sizhe Chen, Geng Yuan, Xinwen Cheng, Yifan Gong, Minghai Qin, Yanzhi Wang, and Xiaolin Huang. 2023. Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors. In ICLR
  30. [30] Tianqi Chen, Shujian Zhang, and Mingyuan Zhou. 2025. Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models. In ICLR
  31. [31] Xinrui Chen, Xu Cao, Jianhao Zhang, Pinlong Zhao, Di Gao, and Ou Wu. 2026. Robust LLM Unlearning via Post Judgment and Multi-round Thinking. In ICLR
  32. [32] Jiali Cheng, George Dasoulas, Huan He, Chirag Agarwal, and Marinka Zitnik
  33. [33] GNNDelete: A General Strategy for Unlearning in Graph Neural Networks. In ICLR
  34. [34] Jingpu Cheng, Ping Liu, Qianxiao Li, and Chi Zhang. 2026. Machine Unlearning under Retain–Forget Entanglement. In ICLR
  35. [35] Xinwen Cheng, Zhehao Huang, Wenxin Zhou, Zhengbao He, Ruikai Yang, Yingwen Wu, and Xiaolin Huang. 2026. Remaining-data-free machine unlearning by suppressing sample contribution. In ICLR
  36. [36] Eli Chien, Haoyu Wang, Ziang Chen, and Pan Li. 2024. Certified machine unlearning via noisy stochastic gradient descent. In NeurIPS
  37. [37] Eli Chien, Haoyu Wang, Ziang Chen, and Pan Li. 2024. Langevin unlearning: A new perspective of noisy gradient descent for machine unlearning. In NeurIPS
  38. [38] Somnath Basu Roy Chowdhury, Krzysztof Marcin Choromanski, Arijit Sehanobish, Kumar Avinava Dubey, and Snigdha Chaturvedi. 2025. Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning. In ICLR
  39. [39] Kaiyuan Deng, Gen Li, Yang Xiao, Bo Hui, and Xiaolong Ma. 2026. Forget Many, Forget Right: Scalable and Precise Concept Unlearning in Diffusion Models. In ICLR
  40. [40] Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik, and Yang Liu. 2025. Adversarial Machine Unlearning. In ICLR
  41. [41] Zonglin Di, Zhaowei Zhu, Jinghan Jia, Jiancheng Liu, Zafar Takhirov, Bo Jiang, Yuanshun Yao, Sijia Liu, and Yang Liu. 2026. Label smoothing improves machine unlearning. In ICLR
  42. [42] Jingfeng Zhang Di Zhao, Hongsheng Hu, Philippe Fournier-Viger, Gillian Dobbie, and Yun Sing Koh. 2026. Unlearning during Training: Domain-Specific Gradient Ascent for Domain Generalization. In ICLR
  43. [43] Chenlu Ding, Jiancan Wu, Yancheng Yuan, Jinda Lu, Kai Zhang, Alex Su, Xiang Wang, and Xiangnan He. 2025. Unified Parameter-Efficient Unlearning for LLMs. In ICLR
  44. [44] Junhao Dong, Hao Zhu, Yifei Zhang, Xinghua Qu, Yew-Soon Ong, and Piotr Koniusz. 2025. Machine unlearning via task simplex arithmetic. In NeurIPS
  45. [45] Alexey Dosovitskiy. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
  46. [46] Yonatan Dukler, Benjamin Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, and Stefano Soatto. 2023. Safe: Machine unlearning with shard graphs. In ICCV
  47. [47] Cynthia Dwork. 2006. Differential privacy. In International Colloquium on Automata, Languages, and Programming
  48. [48] Taha Entesari, Arman Hatami, Rinat Khaziev, Anil Ramakrishna, and Mahyar Fazlyab. 2025. Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models. In NeurIPS
  49. [49] Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. 2024. Scaling Rectified Flow Transformers for High-Resolution Image Synthesis. In ICML
  50. [50] Simone Facchiano, Stefano Saravalle, Matteo Migliarini, Edoardo De Matteis, Alessio Sampieri, Andrea Pilzer, Emanuele Rodolà, Indro Spinelli, Luca Franco, and Fabio Galasso. 2026. Video unlearning via low-rank refusal vector. In ICLR
  51. [51] Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, and Sijia Liu. 2025. Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning. In NeurIPS
  52. [52] Chongyu Fan, Jiancheng Liu, Yihua Zhang, Dennis Wei, Eric Wong, and Sijia Liu. 2024. SalUn: Empowering Machine Unlearning via Gradient-Based Weight Saliency in Both Image Classification and Generation. In International Conference on Learning Representations
  53. [53] Bin Fang, Bo Li, Shuang Wu, Shouhong Ding, Ran Yi, and Lizhuang Ma. 2024. Re-thinking data availability attacks against deep neural networks. In CVPR
  54. [54] XiaoHua Feng, Yuyuan Li, Chaochao Chen, Li Zhang, Longfei Li, Jun Zhou, and Xiaolin Zheng. 2025. Controllable Unlearning for Image-to-Image Generative Models via 𝜖-Constrained Optimization. In ICLR
  55. [55] Liam Fowl, Micah Goldblum, Ping-yeh Chiang, Jonas Geiping, Wojciech Czaja, and Tom Goldstein. 2021. Adversarial examples make strong poisons. In NeurIPS
  56. [56] Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, and Dacheng Tao. 2022. Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning. In ICLR
  57. [57] Chongyang Gao, Lixu Wang, Kaize Ding, Chenkai Weng, Xiao Wang, and Qi Zhu. 2025. On Large Language Model Continual Unlearning. In ICLR
  58. [58] Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A Wichmann. 2020. Shortcut learning in deep neural networks. Nature Machine Intelligence 2, 11 (2020), 665–673
  59. [59] Kristian Georgiev, Roy Rinberg, Sung Min Park, Shivam Garg, Andrew Ilyas, Aleksander Madry, and Seth Neel. 2025. Attribute-to-delete: Machine unlearning via datamodel matching. In ICLR
  60. [60] David Glukhov, Ilia Shumailov, Yarin Gal, Nicolas Papernot, and Vardan Papyan
  61. [61] Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches. In ICML
  62. [62] Vignesh Gokul and Shlomo Dubnov. 2024. Poscuda: Position based convolution for unlearnable audio datasets. arXiv preprint arXiv:2401.02135 (2024)
  63. [63] Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, and Stefano Soatto. 2021. Mixed-privacy forgetting in deep networks. In CVPR
  64. [64] Aditya Golatkar, Alessandro Achille, and Stefano Soatto. 2020. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In CVPR
  65. [65] Chen Gong, Kecen Li, Jin Yao, and Tianhao Wang. 2025. TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents. In NDSS
  66. [66] Xueluan Gong, Yuji Wang, Yanjiao Chen, Haocheng Dong, Yiming Li, Mengyuan Sun, Shuaike Li, Qian Wang, and Chen Chen. 2025. Armor: Shielding unlearnable examples against data augmentation. arXiv preprint arXiv:2501.08862 (2025)
  67. [67] Laura Graves, Vineel Nagisetty, and Vijay Ganesh. 2021. Amnesiac machine learning. In AAAI
  68. [68] Hanlin Gu, Hong Xi Tae, Lixin Fan, and Chee Seng Chan. 2026. Towards Privacy-Guaranteed Label Unlearning in Vertical Federated Learning: Few-Shot Forgetting Without Disclosure. In ICLR
  69. [69] Chuan Guo, Tom Goldstein, Awni Hannun, and Laurens Van Der Maaten. 2020. Certified data removal from machine learning models. In ICML
  70. [70] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. 2025. Deepseek-r1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv preprint arXiv:2501.12948 (2025)
  71. [71] Varun Gupta, Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Chris Waites. 2021. Adaptive machine unlearning. In NeurIPS
  72. [72] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770–778
  73. [73] Pengfei He, Han Xu, Jie Ren, Yingqian Cui, Shenglai Zeng, Hui Liu, Charu C Aggarwal, and Jiliang Tang. 2024. Sharpness-Aware Data Poisoning Attack. In ICLR
  74. [74] Robert Hönig, Javier Rando, Nicholas Carlini, and Florian Tramèr. 2025. Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI. In ICLR
  75. [75] Hsiang Hsu, Pradeep Niroula, Zichang He, Ivan Brugere, Freddy Lecue, and Chun-Fu Chen. 2025. The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples. In NeurIPS
  76. [76] Jinwei Hu, Zhenglin Huang, Xiangyu Yin, Wenjie Ruan, Guangliang Cheng, Yi Dong, and Xiaowei Huang. 2025. FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model. In NeurIPS
  77. [77] Shengyuan Hu, Yiwei Fu, Steven Wu, and Virginia Smith. 2025. Unlearning or Obfuscating? Jogging the Memory of Unlearned LLMs via Benign Relearning. In ICLR
  78. [78] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger
  79. [79] Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4700–4708
  80. [80] Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey, and Yisen Wang. 2021. Unlearnable Examples: Making Personal Data Unexploitable. In ICLR

Showing first 80 references.