Fine-Tuning Masked Diffusion for Provable Self-Correction

David Z. Pan; Hyeji Kim; Jaeyeon Kim; Seunggeun Kim; Sham Kakade; Sitan Chen; Taekyun Lee

arxiv: 2510.01384 · v4 · pith:JLTXCAQUnew · submitted 2025-10-01 · 💻 cs.LG

Jaeyeon Kim, Seunggeun Kim, Taekyun Lee, David Z. Pan, Hyeji Kim, Sham Kakade, Sitan Chen

keywords: self-correction, masked, quality, approach, diffusion, generative, inference, low-quality

A natural desideratum for generative models is self-correction: detecting and revising low-quality tokens at inference time. While Masked Diffusion Models (MDMs) have emerged as a promising approach for generative modeling in discrete spaces, their capacity for self-correction remains poorly understood. Prior attempts to incorporate self-correction into MDMs either require overhauling MDM architectures or training, or rely on imprecise proxies for token quality, which limits their applicability. Motivated by this, we introduce PRISM (Plug-in Remasking for Inference-time Self-correction of Masked Diffusions), a lightweight, model-agnostic approach that applies to any pretrained MDM. Theoretically, PRISM defines a self-correction loss that provably learns per-token quality scores, without RL or a verifier. These quality scores are computed in the same forward pass as the MDM and used to detect low-quality tokens. Empirically, PRISM advances MDM inference across domains and scales: Sudoku; unconditional text (170M); and code with LLaDA (8B).
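
To make the idea of score-based remasking concrete, here is a minimal sketch, not the paper's method. It assumes per-token quality scores are already available as a list of floats, and it remasks the k lowest-scoring unmasked positions. The top-k selection rule, the `MASK` symbol, and the function name `remask_lowest_quality` are all hypothetical choices made for illustration; the abstract does not specify how PRISM selects tokens or how its loss is defined.

```python
from typing import List, Sequence

MASK = "<mask>"  # hypothetical mask symbol, chosen for illustration only


def remask_lowest_quality(
    tokens: Sequence[str], quality: Sequence[float], k: int
) -> List[str]:
    """Return a copy of `tokens` with the k lowest-scoring unmasked positions set to MASK.

    `quality[i]` is an assumed per-token score where higher means more trustworthy.
    Positions that are already masked are never selected.
    """
    if len(tokens) != len(quality):
        raise ValueError("tokens and quality must have the same length")
    # Only positions that currently hold a real token are candidates.
    unmasked = [i for i, tok in enumerate(tokens) if tok != MASK]
    # Sort candidates by ascending score and keep the k worst.
    worst = sorted(unmasked, key=lambda i: quality[i])[:k]
    out = list(tokens)
    for i in worst:
        out[i] = MASK
    return out


example = remask_lowest_quality(["5", "3", MASK, "7"], [0.9, 0.2, 0.0, 0.6], k=1)
```

In this example the token at index 1 has the lowest score among unmasked positions, so it is the one returned to the mask state; a sampler could then refill it on a later step.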

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement
cs.LG 2026-05 unverdicted novelty 7.0

IPR improves valid solution rates on MNIST Sudoku from 55.8% to 75.0% by iteratively refining partial regions in sequential diffusion models without external verifiers or reward models.

Remask, Don't Replace: Token-to-Mask Refinement in Diffusion Large Language Models
cs.CL 2026-04 unverdicted novelty 7.0

Token-to-Mask remasking improves self-correction in diffusion LLMs by resetting erroneous commitments to masks rather than overwriting them, yielding +13.33 points on AIME 2025 and +8.56 on CMATH.

Locally Coherent Parallel Decoding in Diffusion Language Models
cs.CL 2026-03 unverdicted novelty 7.0

CoDiLA adds a compact auxiliary AR model on diffusion latents to enforce local sequential validity during parallel token sampling in discrete diffusion language models.

Discrete Stochastic Localization for Non-autoregressive Generation
cs.LG 2026-02 unverdicted novelty 7.0

Discrete Stochastic Localization lets a single trained network support an entire family of per-token SNR paths for discrete sequence generation, with masked diffusion as a special case, and improves MAUVE scores when ...