The Devil Behind the Mask: An Emergent Safety Vulnerability of Diffusion LLMs
3 Pith papers cite this work.
Representative citing papers (2026)
- BadDLM: Backdooring Diffusion Language Models with Diverse Targets
  BadDLM implants effective backdoors in diffusion language models across concept, attribute, alignment, and payload targets by exploiting denoising dynamics while preserving clean performance.
- GPO-V: Jailbreak Diffusion Vision Language Model by Global Probability Optimization
  GPO-V jailbreaks dVLMs by globally optimizing probabilities in the denoising process to bypass refusal patterns, achieving stealthy and transferable attacks.
- Differences in Text Generated by Diffusion and Autoregressive Language Models
  DLMs exhibit lower n-gram entropy, higher semantic coherence, and higher semantic diversity than ARMs, primarily due to bidirectional context and remasking decoding strategies.
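The n-gram entropy mentioned in the last summary can be sketched as a Shannon entropy over the empirical n-gram distribution of a token sequence. This is a generic illustration, not the cited paper's implementation; the function and variable names here are hypothetical.

```python
# Hypothetical sketch of n-gram entropy: Shannon entropy (in bits) of
# the empirical distribution of n-grams in a token sequence. Lower
# values indicate more repetitive text.
import math
from collections import Counter

def ngram_entropy(tokens, n=2):
    """Shannon entropy (bits) of the empirical n-gram distribution."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A repetitive sequence yields lower n-gram entropy than a varied one.
low = ngram_entropy("the cat sat the cat sat the cat sat".split())
high = ngram_entropy("the quick brown fox jumps over a lazy dog".split())
```

Under this metric, a corpus whose generations reuse the same phrases scores lower than one with more varied continuations, which is the sense in which the paper compares DLM and ARM outputs.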