Simple and effective masked diffusion language models. Advances in Neural Information Processing Systems, 37:130136–130184.
4 Pith papers cite this work. Polarity classification is still indexing.
Year: 2026 · Verdicts: 4 (all UNVERDICTED) · Representative citing papers: 4
Citing papers:
- Discrete Langevin-Inspired Posterior Sampling
  ΔLPS is a gradient-guided discrete posterior sampler for inverse problems that works with masked or uniform discrete diffusion priors and outperforms prior discrete methods on image restoration tasks.
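For intuition, gradient guidance in a discrete sampler can be reduced to tilting the denoiser's per-position categorical logits by a first-order estimate of the measurement log-likelihood gradient, then sampling tokens from the tilted distribution. The sketch below is illustrative only, not ΔLPS itself; the shapes, the `guided_logits` helper, and the guidance scale are assumptions.

```python
import numpy as np

def guided_logits(prior_logits, grad_loglik, scale=1.0):
    # Tilt the denoiser's per-position categorical logits (L, V) by the
    # gradient of the measurement log-likelihood (first-order guidance),
    # so softmax of the result approximates the posterior p(x | y).
    return prior_logits + scale * grad_loglik

def gumbel_max_sample(logits, rng):
    # Draw one token per position via the Gumbel-max trick.
    g = rng.gumbel(size=logits.shape)
    return np.argmax(logits + g, axis=-1)

rng = np.random.default_rng(0)
L, V = 4, 8                                  # toy sequence length, vocab size
prior = rng.normal(size=(L, V))              # stand-in for denoiser logits
grad = np.zeros((L, V))
grad[:, 3] = 50.0                            # likelihood strongly favors token 3
tokens = gumbel_max_sample(guided_logits(prior, grad), rng)
```

With a dominant likelihood gradient, every position is pulled toward token 3 despite the random prior, which is the posterior-tilting behavior the guidance term provides.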
- Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models
  TABOM models inference-time unmasking preferences as a Boltzmann distribution over predictive entropies and derives a ranking loss that aligns DLM training with the decoding trajectories observed at inference, yielding gains in new domains and reduced catastrophic forgetting versus standard SFT.
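The central quantity can be sketched in a few lines: given the model's per-position predictive distributions, lower-entropy positions receive higher unmasking probability under a Boltzmann weighting. This is a toy illustration under assumed conventions (the temperature `tau` and the `unmask_distribution` name are mine), not TABOM's actual training objective.

```python
import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy of each row of an (L, V) probability matrix.
    return -np.sum(p * np.log(p + eps), axis=-1)

def unmask_distribution(probs, tau=1.0):
    # Boltzmann distribution over masked positions: lower predictive
    # entropy -> larger weight -> more likely to be unmasked next.
    H = entropy(probs)
    w = np.exp(-H / tau)
    return w / w.sum()

confident = np.array([0.97, 0.01, 0.01, 0.01])  # low-entropy prediction
uncertain = np.full(4, 0.25)                    # maximum-entropy prediction
pref = unmask_distribution(np.stack([confident, uncertain]))
```

The confident position dominates the preference distribution, matching the commonly observed low-entropy-first decoding order that the paper's ranking loss is built around.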
- DVD: Discrete Voxel Diffusion for 3D Generation and Editing
  DVD treats voxel occupancy as a discrete variable in a diffusion framework to generate, assess, and edit sparse 3D voxels without continuous-value thresholding.
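As a rough illustration of the idea, an absorbing-state ("masked") discrete diffusion forward process over a binary occupancy grid replaces voxels with a MASK state rather than adding continuous noise, so no thresholding is ever needed to recover occupancy. The state encoding and the linear masking schedule below are assumptions for the sketch, not DVD's implementation.

```python
import numpy as np

MASK = 2  # assumed state encoding: 0 = empty, 1 = occupied, 2 = masked

def mask_corrupt(voxels, t, rng):
    # Absorbing-state forward step: each voxel is independently replaced
    # by MASK with probability t in [0, 1]; surviving voxels keep their
    # exact discrete occupancy value.
    m = rng.random(voxels.shape) < t
    return np.where(m, MASK, voxels)

rng = np.random.default_rng(1)
grid = (rng.random((4, 4, 4)) < 0.3).astype(int)  # toy sparse occupancy grid
fully_masked = mask_corrupt(grid, 1.0, rng)       # t = 1: everything absorbed
```

A reverse model would then predict discrete occupancy for masked voxels directly, which is what lets generation and editing stay in the discrete domain end to end.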
- OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models
  OmniVoice introduces a non-autoregressive, diffusion-language-model-style TTS system that maps text directly to multi-codebook acoustic tokens, scaling zero-shot synthesis to over 600 languages and achieving state-of-the-art results on multilingual benchmarks using 581k hours of open data.