Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
- Roles: background (1)
- Polarities: background (1)
- Verdicts: UNVERDICTED (3)
Citing papers

- Large Language Diffusion Models
  LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs such as LLaMA3 8B on standard benchmarks and surpasses GPT-4o on a reversal poem-completion task.
- NI Sampling: Accelerating Discrete Diffusion Sampling by Token Order Optimization
  NI Sampling accelerates discrete diffusion language model sampling by up to 14.3× by training a neural indicator, with a trajectory-preserving objective, to select which tokens to sample at each step.
- Measuring Temporal Linguistic Emergence in Diffusion Language Models
  In diffusion language models, coarse linguistic labels stabilize earlier than exact token identities, uncertainty tracks correctness, and mid-trajectory states are the most sensitive to perturbations.