Disentangled Sparse Representations for Concept-Separated Diffusion Unlearning

2026 · cs.LG · arXiv 2605.12122

1 Pith paper cites this work.


abstract

Unlearning specific concepts in text-to-image diffusion models has become increasingly important for preventing undesirable content generation. Among prior approaches, sparse autoencoder (SAE)-based methods have attracted attention due to their ability to suppress target concepts through lightweight manipulation of latent features, without modifying model parameters. However, SAEs trained with sparse reconstruction objectives do not explicitly enforce concept-wise separation, resulting in shared latent features across concepts. To address this, we propose SAEParate, which organizes latent representations into concept-specific clusters via a concept-aware contrastive objective, enabling more precise concept suppression while reducing unintended interference during unlearning. In addition, we enhance the encoder with a GeLU-based nonlinear transformation to increase its expressive capacity under this separation objective, enabling a more discriminative and disentangled latent space. Experiments on UnlearnCanvas demonstrate state-of-the-art performance, with particularly strong gains in joint style-object unlearning, a challenging setting where existing methods suffer from severe interference between target and non-target concepts.
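As a rough illustration of the two ingredients the abstract names (a GELU-based nonlinear encoder and a concept-aware contrastive objective that pulls same-concept latents together), here is a minimal NumPy sketch. It is not the authors' implementation: the two-layer encoder layout, the ReLU on the output, the parameter names `W1`, `b1`, `W2`, `b2`, and the use of a generic supervised contrastive loss as a stand-in for the paper's objective are all assumptions made for illustration.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def encode(x, W1, b1, W2, b2):
    # Hypothetical encoder: linear -> GELU -> linear -> ReLU.
    # The final ReLU keeps latents non-negative, as is common for sparse autoencoders.
    h = gelu(x @ W1 + b1)
    return np.maximum(h @ W2 + b2, 0.0)


def supervised_contrastive_loss(z, labels, temperature=0.1):
    # Generic supervised contrastive loss (a stand-in, not the paper's exact objective):
    # latents sharing a concept label are positives, all others are negatives.
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-8)
    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, z @ z.T / temperature)  # exclude self-pairs
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive in the batch
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()
```

With such a loss, latents that already cluster by concept score lower than latents whose labels cut across clusters, which is the separation property the abstract argues plain sparse reconstruction does not enforce.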

representative citing papers

Mosaic: Compositional Multi-Concept Erasure via Vector Field Blending

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

Mosaic is a framework for compositional multi-concept erasure in flow-based T2I models via spatial vector field blending without extra optimization, evaluated on the new CoME-Bench benchmark covering intra- and cross-category cases.

