SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
Pith reviewed 2026-05-16 17:52 UTC · model grok-4.3
The pith
Gradient-based weight saliency enables effective unlearning of data, classes, or concepts in both image classifiers and generators while approaching exact retraining performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SalUn computes weight saliency by examining gradients of the forgetting data and then applies an optimization step that updates primarily those salient weights. The result erases targeted information in image classification models and prevents conditional diffusion models from generating specified concepts. Experiments show stability advantages on random data removal and near-100 percent unlearning accuracy on harmful-image prevention tasks, all while preserving accuracy on retained data.
What carries the argument
Gradient-based weight saliency, which ranks model parameters by their gradient magnitude or influence on the forgetting objective and thereby restricts unlearning updates to those parameters.
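A minimal sketch of this saliency step, assuming a PyTorch model and a loader over the forgetting data; the accumulated gradient magnitude and the hard top-k quantile threshold follow the review's description of the mechanism, not necessarily the paper's exact Eq. (3):

```python
import torch

def saliency_mask(model, forget_loader, loss_fn, k=0.5, device="cpu"):
    """Mark the top-k fraction of weights by gradient magnitude on the
    forgetting data: 1 = salient (updated during unlearning), 0 = frozen."""
    model.to(device).eval()
    grads = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in forget_loader:
        model.zero_grad()
        loss_fn(model(x.to(device)), y.to(device)).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                grads[n] += p.grad.detach().abs()
    flat = torch.cat([g.flatten() for g in grads.values()])
    # torch.quantile is capped at ~16M elements; very large models would
    # need a streamed or sampled estimate of this threshold instead.
    threshold = torch.quantile(flat, 1.0 - k)
    return {n: (g >= threshold).float() for n, g in grads.items()}
```

During unlearning, each update is then restricted to the salient set, e.g. by scaling gradients elementwise with `p.grad *= mask[n]` before the optimizer step.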
Load-bearing premise
Gradient information reliably isolates the exact parameters responsible for the forgetting data without causing large unintended changes to retained knowledge.
What would settle it
If a model processed by SalUn still classifies forgotten classes or generates forbidden concepts at rates close to the original trained model on held-out test examples, the claim of effective unlearning would be refuted.
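This refutation test is mechanical to run. A sketch under the class-forgetting setting, assuming a held-out test loader; the `forget_classes` list and the comparison against the original model are illustrative:

```python
import torch

@torch.no_grad()
def forget_accuracy(model, test_loader, forget_classes, device="cpu"):
    """Accuracy restricted to held-out examples of the forgotten classes.
    Effective unlearning should drive this toward chance level for the
    unlearned model while it stays high for the original model."""
    model.to(device).eval()
    targets = torch.tensor(forget_classes)
    correct, total = 0, 0
    for x, y in test_loader:
        keep = torch.isin(y, targets)
        if keep.any():
            preds = model(x[keep].to(device)).argmax(dim=1).cpu()
            correct += (preds == y[keep]).sum().item()
            total += int(keep.sum())
    return correct / max(total, 1)
```

Running this on both the SalUn-processed and the original model makes the claim directly falsifiable: similar scores on the forgotten classes would refute effective unlearning.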
Original abstract
With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Codes are available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes SalUn, a machine unlearning method that computes gradient-based weight saliency on forgetting data to identify and selectively update a subset of model parameters, thereby erasing the influence of specific data points, classes, or concepts. It reports empirical results on image classification (0.2% gap to exact unlearning on CIFAR-10) and conditional diffusion models (near-100% unlearning accuracy, outperforming Erased Stable Diffusion and Forget-Me-Not), claiming to be the first principled approach effective for both domains.
Significance. If the isolation of salient weights holds, the work offers a unified, efficient framework for machine unlearning that narrows the gap to exact retraining while extending to generative models; the open-sourced code at the provided GitHub link is a clear strength that supports reproducibility and further testing.
major comments (2)
- [§3.2] §3.2, Eq. (3): the top-k gradient saliency computed solely on forgetting samples assumes these weights encode the target influence in isolation; however, in shared-backbone networks (ResNet/VGG for classification, U-Net for diffusion), gradients may highlight parameters also used by retained classes/concepts, and the manuscript provides no direct measurement of salient-set overlap or degradation when thresholds vary.
- [Experiments] Experimental section: reported gaps (0.2% on CIFAR-10, near-100% on diffusion) are presented without error bars, seed-wise stability checks on the saliency computation, or ablations on the saliency threshold k; these omissions make it difficult to confirm that the small advantage over baselines is robust rather than sensitive to initialization or hyperparameter choice.
minor comments (2)
- [Abstract] Abstract: the 0.2% stability advantage is stated without naming the exact metric (accuracy? loss?) or reporting variability; add this detail for clarity.
- [§3] Notation: ensure consistent use of symbols for saliency scores and update rules across equations and text; a short table summarizing symbols would aid readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and have revised the paper to incorporate additional analyses for greater clarity and robustness.
Point-by-point responses
-
Referee: [§3.2] §3.2, Eq. (3): the top-k gradient saliency computed solely on forgetting samples assumes these weights encode the target influence in isolation; however, in shared-backbone networks (ResNet/VGG for classification, U-Net for diffusion), gradients may highlight parameters also used by retained classes/concepts, and the manuscript provides no direct measurement of salient-set overlap or degradation when thresholds vary.
Authors: We appreciate this observation regarding the potential for parameter overlap in shared-backbone architectures. While the saliency computation focuses on forgetting samples to identify the most affected weights, we acknowledge that some shared parameters may exist. To directly address this, the revised manuscript now includes a quantitative analysis of the overlap between the top-k salient weight sets derived from forgetting data versus retained data (or concepts). This overlap is measured across the classification and diffusion experiments and shown to be limited, supporting the targeted nature of the updates. We have also added an ablation on varying the threshold k, reporting both unlearning effectiveness and any degradation in retained performance to demonstrate stability within the chosen operating range. revision: yes
-
Referee: [Experiments] Experimental section: reported gaps (0.2% on CIFAR-10, near-100% on diffusion) are presented without error bars, seed-wise stability checks on the saliency computation, or ablations on the saliency threshold k; these omissions make it difficult to confirm that the small advantage over baselines is robust rather than sensitive to initialization or hyperparameter choice.
Authors: We agree that the absence of error bars, multi-seed checks, and k ablations limits the ability to assess robustness. The revised experimental section now reports results averaged over multiple random seeds (with standard deviations shown as error bars) for the primary metrics on CIFAR-10 and the conditional diffusion models. We have also added a dedicated ablation study on the saliency threshold k, illustrating how performance varies with different k values and confirming that the reported gaps to exact unlearning remain stable and small within the selected range. These additions substantiate that the advantages are not artifacts of a single initialization or hyperparameter setting. revision: yes
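The overlap analysis promised in the first response is cheap to instrument. A sketch, assuming saliency masks in the dictionary form of the earlier sketch; `mask_forget` and `mask_retain` (masks computed from forgetting versus retained data) are hypothetical names:

```python
import torch

def mask_overlap(mask_forget, mask_retain):
    """Jaccard overlap between two binary saliency masks. A value near 0
    supports the premise that forget-salient weights are largely disjoint
    from retain-salient ones; a value near 1 would undercut it."""
    intersection, union = 0.0, 0.0
    for name, m_f in mask_forget.items():
        a, b = m_f.bool(), mask_retain[name].bool()
        intersection += (a & b).sum().item()
        union += (a | b).sum().item()
    return intersection / max(union, 1.0)
```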
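The second response's robustness study likewise reduces to a small grid over seeds and thresholds. A sketch; `run_unlearning` is a hypothetical stand-in for the full pipeline, assumed to return the accuracy gap to exact retraining:

```python
import numpy as np

def ablate_threshold(run_unlearning, ks=(0.1, 0.3, 0.5, 0.7), seeds=(0, 1, 2)):
    """For each saliency threshold k, average the gap to exact unlearning
    over seeds; the std gives the error bar the referee asked for."""
    results = {}
    for k in ks:
        gaps = [run_unlearning(k=k, seed=s) for s in seeds]
        results[k] = (float(np.mean(gaps)), float(np.std(gaps)))
    return results  # k -> (mean gap, std)
```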
Circularity Check
No significant circularity: SalUn is a new algorithmic construction validated against external baselines.
full rationale
The paper introduces gradient-based weight saliency (Eq. 3 in §3.2) as a novel MU procedure that computes top-k salient weights from forgetting-data gradients and applies targeted updates. Performance is measured directly against exact retraining from scratch and prior MU baselines (e.g., Erased Stable Diffusion) on CIFAR-10, ImageNet, and diffusion models, with reported gaps (0.2% stability) and unlearning accuracy (~100%). No equation reduces the claimed improvement to a fitted hyperparameter by definition, no self-citation chain justifies the core premise, and the uniqueness claim is presented as an empirical observation rather than a theorem derived from prior author work. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Gradient-based saliency computed on forgetting data isolates the relevant model weights for unlearning
invented entities (1)
-
weight saliency: no independent evidence
Forward citations
Cited by 17 Pith papers
-
Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data
Asymmetric Langevin Unlearning uses public data to suppress unlearning noise costs by O(1/n_pub²), enabling practical mass unlearning with preserved utility under distribution mismatch.
-
Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation
Class-level unlearning shortcuts via bias suppression in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.
-
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
-
Efficient Unlearning through Maximizing Relearning Convergence Delay
The Influence Eliminating Unlearning framework maximizes relearning convergence delay via weight decay and noise injection to remove the influence of a forgetting set while preserving accuracy on retained data.
-
Is your algorithm unlearning or untraining?
Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).
-
CURE:Circuit-Aware Unlearning for LLM-based Recommendation
CURE disentangles LLM recommendation circuits into forget-specific, retain-specific, and task-shared modules with tailored update rules to achieve more effective unlearning than weighted baselines.
-
Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning
A contrastive visual forgetting technique constrained to the null space of retained knowledge enables targeted unlearning of visual concepts in MLLMs while preserving non-target visual and all textual knowledge.
-
Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM
Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
-
IPRU: Input-Perturbation-based Radio Frequency Fingerprinting Unlearning for LAWNs
IPRU erases target AAV radio fingerprints via an optimized input perturbation vector, delivering 1.41% unlearning accuracy, 99.41% remaining accuracy, full membership-inference resistance, and 5.79X speedup over retraining.
-
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration
TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.
-
Class Unlearning via Depth-Aware Removal of Forget-Specific Directions
DAMP performs one-shot class unlearning by extracting and projecting out forget-specific residual directions at each network depth using class prototypes and a separability-derived scaling rule.
-
BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning
BID-LoRA uses bi-directional low-rank adapters with retain/new/unlearn pathways and escape unlearning to enable continual learning and unlearning while minimizing knowledge leakage and parameter updates.
-
EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure
EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.
-
Bias Redistribution in Visual Machine Unlearning: Does Forgetting One Group Harm Another?
Unlearning a demographic group in CLIP models redistributes bias primarily along gender boundaries rather than eliminating it.
-
Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models
Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.
-
Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement
Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.
-
Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures
A modified SISA architecture with replay and gating achieves effective class removal from trained CNNs on image datasets while preserving accuracy and cutting retraining costs.
Reference graph
Works this paper leans on
-
[1]
Sanity checks for saliency maps
Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, and Been Kim. Sanity checks for saliency maps. Advances in neural information processing systems, 31, 2018
work page 2018
-
[2]
Gradient surgery for one-shot unlearning on generative model, 2023
Seohui Bae, Seoyoon Kim, Hyemin Jung, and Woohyung Lim. Gradient surgery for one-shot unlearning on generative model, 2023
work page 2023
-
[4]
Nudenet: Neural nets for nudity classification, detection and selective censoring, 2019
P Bedapudi. Nudenet: Neural nets for nudity classification, detection and selective censoring, 2019
work page 2019
-
[6]
Membership inference attacks from first principles
Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramer. Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP), pp. 1897--1914. IEEE, 2022
work page 2022
-
[7]
Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks
Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, and Vineeth N Balasubramanian. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 839--847. IEEE, 2018
work page 2018
-
[8]
Graph unlearning
Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, and Yang Zhang. Graph unlearning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 499--513, 2022a
work page 2022
-
[9]
Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary
Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, and Chen Wang. Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7766--7775, 2023
work page 2023
-
[10]
Quarantine: Sparsity can uncover the trojan attack trigger for free
Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, and Zhangyang Wang. Quarantine: Sparsity can uncover the trojan attack trigger for free. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 598--609, 2022b
work page 2022
-
[15]
Our data, ourselves: Privacy via distributed noise generation
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Annual international conference on the theory and applications of cryptographic techniques, pp. 486--503. Springer, 2006
work page 2006
-
[18]
Making ai forget you: Data deletion in machine learning
Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. Making ai forget you: Data deletion in machine learning. Advances in neural information processing systems, 32, 2019
work page 2019
-
[19]
Eternal sunshine of the spotless net: Selective forgetting in deep networks
Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9304--9312, 2020
work page 2020
-
[20]
Amnesiac machine learning
Laura Graves, Vineel Nagisetty, and Vijay Ganesh. Amnesiac machine learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 11516--11524, 2021
work page 2021
-
[24]
Deep residual learning for image recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770--778, 2016
work page 2016
-
[25]
Selective amnesia: A continual learning approach to forgetting in deep generative models, 2023
Alvin Heng and Harold Soh. Selective amnesia: A continual learning approach to forgetting in deep generative models, 2023
work page 2023
-
[27]
The european union general data protection regulation: what it is and what it means
Chris Jay Hoofnagle, Bart van der Sloot, and Frederik Zuiderveen Borgesius. The european union general data protection regulation: what it is and what it means. Information & Communications Technology Law, 28(1): 65--98, 2019
work page 2019
-
[28]
Fastai: A layered api for deep learning
Jeremy Howard and Sylvain Gugger. Fastai: A layered api for deep learning. Information, 11(2): 108, 2020
work page 2020
-
[30]
Approximate data deletion from machine learning models
Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, and James Zou. Approximate data deletion from machine learning models. In International Conference on Artificial Intelligence and Statistics, pp. 2008--2016. PMLR, 2021
work page 2021
-
[31]
A data-based perspective on transfer learning
Saachi Jain, Hadi Salman, Alaa Khaddaj, Eric Wong, Sung Min Park, and Aleksander Madry. A data-based perspective on transfer learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3613--3622, 2023
work page 2023
-
[32]
How can i explain this to you? an empirical study of deep neural network explanation methods
Jeya Vikranth Jeyakumar, Joseph Noor, Yu-Hsi Cheng, Luis Garcia, and Mani Srivastava. How can i explain this to you? an empirical study of deep neural network explanation methods. Advances in Neural Information Processing Systems, 33: 4211--4222, 2020
work page 2020
-
[34]
Understanding black-box predictions via influence functions
Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. In International conference on machine learning, pp. 1885--1894. PMLR, 2017
work page 2017
-
[35]
Learning multiple layers of features from tiny images
Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009
work page 2009
-
[36]
Tiny imagenet visual recognition challenge
Ya Le and Xuan Yang. Tiny imagenet visual recognition challenge. CS 231N, 7(7): 3, 2015
work page 2015
-
[39]
Swin transformer: Hierarchical vision transformer using shifted windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pp. 10012--10022, 2021
work page 2021
-
[40]
Locating and editing factual associations in gpt
Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in gpt. Advances in Neural Information Processing Systems, 35: 17359--17372, 2022
work page 2022
-
[42]
Descent-to-delete: Gradient-based methods for machine unlearning
Seth Neel, Aaron Roth, and Saeed Sharifi-Malvajerdi. Descent-to-delete: Gradient-based methods for machine unlearning. In Algorithmic Learning Theory, pp. 931--962. PMLR, 2021
work page 2021
-
[43]
Reading digits in natural images with unsupervised feature learning
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011
work page 2011
-
[45]
Proximal algorithms
Neal Parikh, Stephen Boyd, et al. Proximal algorithms. Foundations and Trends in Optimization, 1(3): 127--239, 2014
work page 2014
-
[49]
Scaling vision with sparse mixture of experts
Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby. Scaling vision with sparse mixture of experts. Advances in Neural Information Processing Systems, 34: 8583--8595, 2021
work page 2021
-
[50]
High-resolution image synthesis with latent diffusion models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684--10695, 2022
work page 2022
-
[51]
Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models, 2023
Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models, 2023
work page 2023
-
[53]
Laion-5b: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion-5b: An open large-scale dataset for training next generation image-text models. Advances in Neural Information Processing Systems, 35: 25278--25294, 2022
work page 2022
-
[54]
Remember what you want to forget: Algorithms for machine unlearning
Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning. Advances in Neural Information Processing Systems, 34: 18075--18086, 2021
work page 2021
-
[55]
Grad-CAM: Visual explanations from deep networks via gradient-based localization
Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pp. 618--626, 2017
work page 2017
-
[60]
Diffusion art or digital forgery? investigating data replication in diffusion models
Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Diffusion art or digital forgery? investigating data replication in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6048--6058, 2023
work page 2023
-
[62]
Axiomatic attribution for deep networks
Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 3319--3328. JMLR.org, 2017
work page 2017
-
[63]
Unrolling sgd: Understanding factors influencing machine unlearning
Anvith Thudi, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot. Unrolling sgd: Understanding factors influencing machine unlearning. In 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pp. 303--319. IEEE, 2022a
work page 2022
-
[64]
On the necessity of auditable algorithmic definitions for machine unlearning
Anvith Thudi, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot. On the necessity of auditable algorithmic definitions for machine unlearning. In 31st USENIX Security Symposium (USENIX Security 22), pp. 4007--4022, 2022b
work page 2022
-
[65]
Machine unlearning via algorithmic stability
Enayat Ullah, Tung Mai, Anup Rao, Ryan A Rossi, and Raman Arora. Machine unlearning via algorithmic stability. In Conference on Learning Theory, pp. 4126--4142. PMLR, 2021
work page 2021
-
[66]
Federated unlearning via class-discriminative pruning
Junxiao Wang, Song Guo, Xin Xie, and Heng Qi. Federated unlearning via class-discriminative pruning. In Proceedings of the ACM Web Conference 2022, pp. 622--632, 2022
work page 2022
-
[68]
Leveraging sparse linear layers for debuggable deep networks
Eric Wong, Shibani Santurkar, and Aleksander Madry. Leveraging sparse linear layers for debuggable deep networks. In International Conference on Machine Learning, pp. 11205--11216. PMLR, 2021
work page 2021
-
[69]
Federated unlearning: Guarantee the right of clients to forget
Leijie Wu, Song Guo, Junxiao Wang, Zicong Hong, Jie Zhang, and Yaohong Ding. Federated unlearning: Guarantee the right of clients to forget. IEEE Network, 36(5): 129--135, 2022
work page 2022
-
[71]
Visualizing and understanding convolutional networks
Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In European conference on computer vision, pp. 818--833. Springer, 2014
work page 2014
-
[74]
Learning deep features for discriminative localization
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921--2929, 2016
work page 2016
-
[76]
Can sensitive information be deleted from llms? objectives for defending against extraction attacks
arXiv preprint arXiv:2309.17410, 2023
-
[77]
Understanding instance-level impact of fairness constraints
International Conference on Machine Learning, 2022
work page 2022
-
[78]
Canadian privacy law: The personal information protection and electronic documents act (PIPEDA)
Int'l. In-House Counsel J., 2008
work page 2008
-
[79]
Towards Modular Machine Learning Solution Development: Benefits and Trade-offs
arXiv preprint arXiv:2301.09753, 2023
-
[82]
SmoothGrad: removing noise by adding noise
arXiv preprint arXiv:1706.03825, 2017
-
[84]
Prompt certified machine unlearning with randomized gradient smoothing and quantization
Advances in Neural Information Processing Systems
-
[86]
Striving for Simplicity: The All Convolutional Net
arXiv preprint arXiv:1412.6806, 2014
-
[87]
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
arXiv preprint arXiv:1312.6034, 2013
-
[90]
RISE: Randomized Input Sampling for Explanation of Black-box Models
arXiv preprint arXiv:1806.07421, 2018
-
[91]
Studying Large Language Model Generalization with Influence Functions
arXiv preprint arXiv:2308.03296, 2023
-
[92]
Data selection for language models via importance resampling
arXiv preprint arXiv:2302.03169, 2023
-
[94]
Datamodels: Predicting predictions from training data
arXiv preprint arXiv:2202.00622, 2022
-
[96]
Trak: Attributing model behavior at scale
arXiv preprint arXiv:2303.14186, 2023
-
[100]
Certified Data Removal in Sum-Product Networks
2022 IEEE International Conference on Knowledge Graph (ICKG), 2022
work page 2022
-
[102]
Fair Machine Unlearning: Data Removal while Mitigating Disparities
arXiv preprint arXiv:2307.14754, 2023
-
[106]
Dataset security for machine learning: Data poisoning, backdoor attacks, and defenses
IEEE Transactions on Pattern Analysis and Machine Intelligence
-
[108]
Continual lifelong learning with neural networks: A review
Neural Networks, 2019
work page 2019
-
[110]
Optimization with sparsity-inducing penalties
Foundations and Trends, 2012
work page 2012
-
[111]
Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting
Advances in Neural Information Processing Systems