Chained rewrites by open-weight LLMs reduce watermark detection on diffusion LM outputs from 87.9% to 4.86% after five steps across multiple styles and models.
A Additional Results. B Metric Definitions. Watermark signal drop is computed as $\mathrm{WSD}_{\hat{g}}(a, h) = \frac{1}{N} \sum_{i=1}^{N} \left[ \hat{g}\big(x_0^{(i)}\big) - \hat{g}\big(y_{a,h}^{(i)}\big) \right]$
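As a minimal sketch of the metric above: the function below averages the per-sample difference between the detector score on the original text and on its rewrite. The names `watermark_signal_drop`, `detector`, `originals`, and `rewrites` are hypothetical and chosen only for illustration. The excerpt does not define what ĝ returns or what a and h index, so the detector is treated here as an arbitrary callable that maps a text to a float, and the rewrites are assumed to have already been produced for one fixed (a, h).

```python
from typing import Callable, Sequence


def watermark_signal_drop(
    detector: Callable[[str], float],
    originals: Sequence[str],
    rewrites: Sequence[str],
) -> float:
    """Mean drop in detector score between originals and their rewrites.

    detector  -- stand-in for g-hat; any callable mapping text to a score (assumption)
    originals -- the watermarked texts x_0^(i)
    rewrites  -- the rewritten texts y_{a,h}^(i) for one fixed (a, h), same order
    """
    if len(originals) != len(rewrites):
        raise ValueError("originals and rewrites must be paired one-to-one")
    if not originals:
        raise ValueError("at least one sample is required")

    n = len(originals)
    # Sum of g(x_0) - g(y) over all pairs, then divide by N.
    total = sum(detector(x) - detector(y) for x, y in zip(originals, rewrites))
    return total / n


# Toy usage with a placeholder detector (text length), not a real watermark detector.
toy_detector = lambda text: float(len(text))
drop = watermark_signal_drop(toy_detector, ["aaaa", "bb"], ["aa", "b"])
```

A positive value means the rewrites score lower than the originals on average, that is, the watermark signal weakened.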
Cited by 1 Pith paper.
Fields: cs.CL (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers (1):
Chainwash: Multi-Step Rewriting Attacks on Diffusion Language Model Watermarks