Accelerated sampling from masked diffusion models via entropy bounded unmasking. arXiv preprint arXiv:2505.24857.
4 papers cite this work.
Citing papers
- LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection
  LEAP detects early-converging tokens in dLLMs via future context filtering and multi-sequence superposition, reducing average denoising steps by about 30% while maintaining accuracy (sketch below).
- Stability-Weighted Decoding for Diffusion Language Models
  Stability-Weighted Decoding improves diffusion LLM accuracy by weighting each token's score with a temporal-stability term derived from the KL divergence between consecutive prediction steps (sketch below).
- Differences in Text Generated by Diffusion and Autoregressive Language Models
  DLMs exhibit lower n-gram entropy, higher semantic coherence, and higher semantic diversity than ARMs, attributed primarily to bidirectional context and remasking decoding strategies (the entropy metric is sketched below).
- DMax: Aggressive Parallel Decoding for dLLMs
  DMax speeds up parallel decoding in diffusion language models by combining on-policy training for error recovery with soft embedding interpolation for iterative revision, improving tokens per forward pass roughly 2-3x on benchmarks while preserving accuracy (sketch below).
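The step savings in LEAP (first item) come from committing tokens whose predictions have stopped changing. A minimal sketch of that idea, assuming a hypothetical `model(x, t)` interface that returns per-position logits; the argmax-stability test with a `patience` threshold is an illustrative stand-in for LEAP's lookahead filtering, not the paper's actual criterion.

```python
import torch

def decode_with_early_convergence(model, x, mask_id, steps=64, patience=2):
    """Commit masked positions whose argmax prediction has stayed the same
    for `patience` consecutive denoising steps, rather than unmasking on a
    fixed schedule. Illustrative stand-in for LEAP's detector."""
    prev_pred = None
    stable = torch.zeros_like(x)                 # consecutive-agreement counter
    for t in range(steps):
        logits = model(x, t)                     # hypothetical: (seq_len, vocab)
        pred = logits.argmax(dim=-1)             # (seq_len,)
        if prev_pred is not None:
            stable = torch.where(pred == prev_pred, stable + 1,
                                 torch.zeros_like(stable))
        prev_pred = pred
        converged = (stable >= patience) & (x == mask_id)
        x = torch.where(converged, pred, x)      # freeze early-converged tokens
        if not (x == mask_id).any():             # nothing left masked: stop early
            break
    return x
```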
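For Stability-Weighted Decoding (second item), the KL-based modulation can be sketched directly. The mapping `exp(-lam * kl)` from divergence to weight, and the choice of max probability as the base score, are assumptions for illustration rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def stability_weighted_scores(logits_prev, logits_curr, lam=1.0):
    """Per-position KL divergence between consecutive steps' predictive
    distributions, mapped to a (0, 1] stability weight that scales the
    usual max-probability confidence used to rank unmasking candidates."""
    log_p = F.log_softmax(logits_curr, dim=-1)    # current step
    log_q = F.log_softmax(logits_prev, dim=-1)    # previous step
    kl = (log_p.exp() * (log_p - log_q)).sum(-1)  # KL(curr || prev) per position
    stability = torch.exp(-lam * kl)              # ~1 if stable, ->0 if shifting
    confidence = log_p.exp().max(dim=-1).values   # base per-token score
    return confidence * stability                 # stability-modulated score
```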
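The n-gram entropy metric behind the third item's comparison is standard Shannon entropy over the empirical n-gram distribution; a minimal version over a token list (the paper's tokenization and choice of n may differ):

```python
import math
from collections import Counter

def ngram_entropy(tokens, n=2):
    """Shannon entropy (bits) of the empirical n-gram distribution;
    lower values mean more repetitive, more predictable text."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

For example, `ngram_entropy("the cat sat on the mat".split(), n=2)` returns log2(5) ≈ 2.32 bits, since all five bigrams are distinct.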
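DMax's "soft embedding interpolation" (fourth item) can be read as feeding back a probability-weighted mixture of token embeddings instead of a committed token, so later passes can revise uncertain positions. A sketch under that reading; `embedding_weight` denotes the model's token-embedding matrix, and the on-policy training component is omitted entirely.

```python
import torch

def soft_embedding(logits, embedding_weight, temperature=1.0):
    """Probability-weighted mixture of token embeddings: uncertain
    positions are fed back as a blend rather than a hard token, leaving
    them open to revision on the next forward pass."""
    probs = torch.softmax(logits / temperature, dim=-1)  # (seq_len, vocab)
    return probs @ embedding_weight                      # (seq_len, d_model)
```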