arXiv preprint arXiv:2404.15766

Kaiwen Xue, Yuhao Zhou, Shen Nie, Xu Min, Xiaolu Zhang, Jun Zhou, Chongxuan Li · 2024 · arXiv 2404.15766

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Observation-Aligned Mask Priors for Learning Physical Dynamics from Authentic Occlusions

cs.CV · 2026-05-16 · unverdicted · novelty 7.0

A framework pretrained on authentic binary occlusion masks uses guided sampling and intersection-based partitioning to train diffusion models on incomplete physical observations without zero-query regions.

Discrete Bayesian Sample Inference for Graph Generation

cs.LG · 2025-11-04 · unverdicted · novelty 6.0

GraphBSI uses Bayesian Sample Inference as noise-controlled SDEs to generate discrete graphs in one shot, achieving state-of-the-art results on molecular benchmarks Moses and GuacaMol.

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

cs.LG · 2025-05-22 · conditional · novelty 6.0

LLaDA-V is a diffusion-based multimodal large language model that reaches competitive or state-of-the-art results on visual instruction tasks while using a non-autoregressive architecture.

Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention

cs.CV · 2026-05-19 · unverdicted · novelty 5.0

BA-Att introduces pre-downsampled block selection with norm-sorting and diagonal covariance correction to approximate sparse attention, yielding up to 6.95x speedup at 50% sparsity across language, multimodal, and video models.

Incomplete Data, Complete Dynamics: A Diffusion Approach

cs.LG · 2025-09-24 · unverdicted · novelty 5.0

A conditional diffusion model trained on partitioned incomplete samples for physical dynamics achieves asymptotic convergence to the true generative process under mild conditions and outperforms baselines in imputation.

A Unified Measure-Theoretic View of Diffusion, Score-Based, and Flow Matching Generative Models

cs.LG · 2026-05-07 · unverdicted · novelty 4.0

Diffusion, score-based, and flow matching models are unified as instances of learning time-dependent vector fields inducing marginal distributions governed by continuity and Fokker-Planck equations.

citing papers explorer

Showing 7 of 7 citing papers.

Large Language Diffusion Models · cs.CL · 2025-02-14 · unverdicted · none · ref 56
Observation-Aligned Mask Priors for Learning Physical Dynamics from Authentic Occlusions · cs.CV · 2026-05-16 · unverdicted · none · ref 24
Discrete Bayesian Sample Inference for Graph Generation · cs.LG · 2025-11-04 · unverdicted · none · ref 38
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning · cs.LG · 2025-05-22 · conditional · none · ref 99
Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention · cs.CV · 2026-05-19 · unverdicted · none · ref 49
Incomplete Data, Complete Dynamics: A Diffusion Approach · cs.LG · 2025-09-24 · unverdicted · none · ref 46
A Unified Measure-Theoretic View of Diffusion, Score-Based, and Flow Matching Generative Models · cs.LG · 2026-05-07 · unverdicted · none · ref 51
