LaViDa: A Large Diffusion Language Model for Multimodal Understanding
7 papers cite this work.
Representative citing papers
GPO-V jailbreaks dVLMs by globally optimizing probabilities in the denoising process to bypass refusal patterns, achieving stealthy and transferable attacks.
BARD bridges autoregressive and diffusion VLMs with progressive block merging plus stage-wise intra-diffusion distillation, delivering 3x speedup and new SOTA on open dVLMs using under 4.4M data points.
A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.
LLaDA2.0-Uni unifies multimodal understanding and generation inside one discrete diffusion large language model with a semantic tokenizer, MoE backbone, and diffusion decoder.
Position and step penalties plus visual reasoning guidance fix premature answering and weak visual grounding in diffusion MLLMs, delivering up to 7.5% accuracy gains and over 3x speedup.
LLaDA2.0 scales discrete diffusion language models to 100B parameters via systematic conversion from autoregressive models using a 3-phase WSD training scheme and releases open-source 16B and 100B MoE variants (see the schedule sketch below).
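In recent LLM training, "WSD" most plausibly denotes the warmup-stable-decay learning-rate schedule, which would match the "3-phase" description; the paper's actual phase boundaries and learning rates are not given here, so every value in the sketch below is an illustrative placeholder.

```python
# Minimal sketch of a 3-phase WSD (warmup-stable-decay) learning-rate
# schedule. Phase lengths and rates are illustrative placeholders,
# not values from the LLaDA2.0 paper.
def wsd_lr(step: int,
           peak_lr: float = 3e-4,
           warmup_steps: int = 2_000,
           stable_steps: int = 100_000,
           decay_steps: int = 20_000,
           min_lr: float = 3e-5) -> float:
    if step < warmup_steps:
        # Phase 1: linear warmup from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Phase 2: hold at the peak (the "stable" plateau).
        return peak_lr
    # Phase 3: linear decay from peak_lr down to min_lr.
    t = min(step - warmup_steps - stable_steps, decay_steps) / decay_steps
    return peak_lr + (min_lr - peak_lr) * t
```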
Citing papers explorer
- Discrete Langevin-Inspired Posterior Sampling. ΔLPS is a gradient-guided discrete posterior sampler for inverse problems that works with masked or uniform discrete diffusion priors and outperforms prior discrete methods on image restoration tasks (a guidance sketch follows this list).
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B. LLaDA2.0 scales discrete diffusion language models to 100B parameters via systematic conversion from autoregressive models using a 3-phase WSD training scheme and releases open-source 16B and 100B MoE variants.
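To make the ΔLPS entry concrete, here is a hedged sketch of the general gradient-guided discrete posterior sampling pattern it describes: relax the denoiser's categorical output, differentiate a measurement likelihood through that relaxation, and shift the logits along the gradient before sampling the next state. This is not the authors' code; `denoiser`, `forward_op`, `codebook`, and all hyperparameters are hypothetical stand-ins.

```python
# Hedged sketch of gradient-guided discrete posterior sampling in the
# spirit of ΔLPS (not the paper's implementation). `denoiser`,
# `forward_op`, and all hyperparameters are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def guided_reverse_step(denoiser, x_t, t, y, forward_op,
                        guidance_scale=1.0, sigma=0.1):
    """One reverse step of a masked/uniform discrete diffusion prior,
    nudged toward consistency with the measurement y = forward_op(x)."""
    logits = denoiser(x_t, t)                      # (B, L, V) token logits
    probs = F.softmax(logits, dim=-1).detach().requires_grad_(True)

    # Relaxed decode: expected codebook values under `probs`, so the
    # measurement likelihood becomes differentiable in `probs`.
    codebook = forward_op.codebook                 # (V, D) assumed lookup
    x_hat = probs @ codebook                       # (B, L, D) soft decode
    residual = y - forward_op(x_hat)
    log_lik = -(residual ** 2).sum() / (2 * sigma ** 2)

    # Langevin-inspired guidance: shift the denoiser logits along the
    # likelihood gradient before sampling the next (less-noisy) state.
    (grad,) = torch.autograd.grad(log_lik, probs)
    guided_logits = logits + guidance_scale * grad
    return torch.distributions.Categorical(
        logits=guided_logits).sample()             # (B, L) next tokens
```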