hub
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
24 Pith papers cite this work. Polarity classification is still indexing.
abstract
With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer from limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resulting method, which we call saliency unlearning (SalUn), narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Code is available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)
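The abstract does not spell out how weight saliency is computed. As a rough illustration only, the sketch below assumes (following the "gradient-based" wording in the title) that a weight counts as salient when the magnitude of its forgetting-loss gradient meets a threshold, and that the unlearning update is then applied to salient weights only. The function names, the toy numbers, and the choice of the median as threshold are all hypothetical and are not taken from the SalUn code base.

```python
from statistics import median

# Illustrative sketch only: names, toy values, and the median threshold
# are assumptions, not the SalUn implementation.


def saliency_mask(grads, threshold):
    """Return 1 for weights whose gradient magnitude meets the threshold, else 0."""
    return [1 if abs(g) >= threshold else 0 for g in grads]


def masked_update(weights, updates, mask):
    """Apply an update only to the weights selected by the mask."""
    return [w + m * u for w, u, m in zip(weights, updates, mask)]


# Toy model with four weights.
weights = [0.5, -1.0, 2.0, 0.25]
# Gradients of a hypothetical forgetting loss with respect to each weight.
forget_grads = [0.5, -2.0, 0.25, 4.0]
# A hypothetical unlearning step proposed for every weight.
updates = [-0.5, 0.5, -0.5, 0.5]

threshold = median(abs(g) for g in forget_grads)
mask = saliency_mask(forget_grads, threshold)
new_weights = masked_update(weights, updates, mask)

print(mask)
print(new_weights)
```

In this toy setting only the two weights with the largest gradient magnitudes change, which mirrors the abstract's point that unlearning is directed at specific weights rather than the entire model.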
citing papers explorer
-
PIU: Proximity-guided Identity Unlearning in ID-Conditioned Diffusion Models
PIU suppresses target identity generation in Arc2Face by replacing it with a proximity-selected anchor identity through localized fine-tuning of cross-attention layers while preserving output quality for other identities.
-
Machine Unlearning for Masked Diffusion Language Models
MDU minimizes forward KL divergence from prompt-conditional to prompt-masked unconditional predictions at masked positions to unlearn knowledge in MDLMs while trading off privacy and utility via temperature scaling.
-
Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data
Asymmetric Langevin Unlearning uses public data to suppress unlearning noise costs by O(1/n_pub²), enabling practical mass unlearning with preserved utility under distribution mismatch.
-
Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation
Class-level unlearning can take a shortcut by suppressing the bias term in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.
-
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
-
Efficient Unlearning through Maximizing Relearning Convergence Delay
The Influence Eliminating Unlearning framework maximizes relearning convergence delay via weight decay and noise injection to remove the influence of a forgetting set while preserving accuracy on retained data.
-
Is your algorithm unlearning or untraining?
The machine unlearning literature conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).
-
CURE: Circuit-Aware Unlearning for LLM-based Recommendation
CURE disentangles LLM recommendation circuits into forget-specific, retain-specific, and task-shared modules with tailored update rules to achieve more effective unlearning than weighted baselines.
-
CPC-VAR: Continual Personalized and Compositional Generation in Visual Autoregressive Models
CPC-VAR adds Gradient-based Concept Neuron Selection for continual single-concept learning and a context-aware multi-branch composition strategy to reduce forgetting and entanglement in VAR-based personalized image generation.
-
Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning
A contrastive visual forgetting technique constrained to the null space of retained knowledge enables targeted unlearning of visual concepts in MLLMs while preserving non-target visual and all textual knowledge.
-
Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM
Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
-
IPRU: Input-Perturbation-based Radio Frequency Fingerprinting Unlearning for LAWNs
IPRU erases target AAV radio fingerprints via an optimized input perturbation vector, delivering 1.41% unlearning accuracy, 99.41% remaining accuracy, full membership-inference resistance, and a 5.79× speedup over retraining.
-
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration
TICoE achieves more precise and faithful concept erasure in text-to-image models by combining text and image data through a convex manifold and hierarchical learning, outperforming prior methods.
-
Class Unlearning via Depth-Aware Removal of Forget-Specific Directions
DAMP performs one-shot class unlearning by depth-aware projection removal of forget-specific directions, producing forgetting behavior closer to retraining from scratch than prior methods on image classification tasks.
-
BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning
BID-LoRA uses bi-directional low-rank adapters with retain/new/unlearn pathways and escape unlearning to enable continual learning and unlearning while minimizing knowledge leakage and parameter updates.
-
EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure
EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.
-
Bias Redistribution in Visual Machine Unlearning: Does Forgetting One Group Harm Another?
Unlearning a demographic group in CLIP models redistributes bias primarily along gender boundaries rather than eliminating it.
-
Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models
Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.
-
Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement
Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.
-
Forget-It-All: Multi-Concept Machine Unlearning via Concept-Aware Neuron Masking
FIA uses contrastive concept saliency and temporal-spatial neuron identification to build unified masks that erase multiple target concepts while preserving general generation quality in diffusion models.
-
Exploring Nonlinear Pathway in Parameter Space for Machine Unlearning
MCU applies mode connectivity to trace nonlinear unlearning pathways in parameter space, adds a parameter mask and adaptive penalty, and produces a range of unlearning models that plug into existing methods.
-
BARRIER: Bounded Activation Regions for Robust Information Erasure
BARRIER applies interval arithmetic to SVD-based activation projections to create bounded forget regions that enable aggressive unlearning while providing formal protection for retain distributions via tail bounds on functional drift.
-
Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures
A modified SISA architecture with replay and gating achieves effective class removal from trained CNNs on image datasets while preserving accuracy and cutting retraining costs.
-
GrOCE: Graph-Guided Online Concept Erasure for Text-to-Image Diffusion Models
GrOCE uses dynamic semantic graphs for online, training-free erasure of target concepts from diffusion model prompts via cluster identification and selective severing.