SafeDIG applies position-aware sparse feature transfer via SAEs in DiT models to reduce unsafe generations in target risk domains on FLUX.1 Dev and SD 3.5 while keeping source safety and quality.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers