Obliviate erases targeted concepts from autoregressive image generators via KL supervision on visual tokens over full trajectories, cutting nudity rates sharply on benchmarks while keeping general performance.
MMA-Diffusion: Multimodal attack on diffusion models
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (8)
verdicts: UNVERDICTED (8)
roles: method (1)
polarities: use method (1)
representative citing papers
Introduces a layered intervention framework for knowledge infusion in multimodal generative models and empirically demonstrates complementarity of layers in a safety-alignment task with diffusion models.
LA-LQR applies latent-space linear-quadratic regulator control to steer text-to-video model activations toward desired features while penalizing excessive changes.
A method using attention head vectors detects and suppresses risky content generation in Diffusion Transformers at inference time.
FlowGuard detects unsafe content during diffusion image generation via linear latent decoding and curriculum learning, outperforming prior methods by over 30% F1 while reducing GPU memory by 97% and projection time to 0.2 seconds.
EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.
Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.
SPOT projects prompts to a tau-safe set via total variation to cut inappropriate content 14-44% relative to baselines while preserving benign prompt behavior in frozen T2I models.