hub

A reparameterized discrete diffusion model for text generation

A reparameterized discrete diffusion model for text generation , author= · 2023 · arXiv 2302.05737

15 Pith papers cite this work. Polarity classification is still indexing.

15 Pith papers citing it

read on arXiv browse 15 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

q-bio.QM · 2026-05-05 · unverdicted · novelty 8.0

A-CODE presents a fully atomic one-stage multimodal diffusion model for protein co-design that claims superior unconditional generation performance over prior one- and two-stage models plus a tenfold success-rate gain on hard binder-design tasks.

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

Uniform diffusion models rely on a leave-one-out denoiser rather than the usual denoising posterior, with exact conversions derived; an absorbing-state reformulation is introduced that matches or exceeds masked diffusion on language modeling while preserving the original joint distribution.

Leveraging Pretrained Language Models as Energy Functions for Glauber Dynamics Text Diffusion

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

Pretrained language models are used as energy functions for Glauber dynamics in discrete text diffusion, improving generation quality over prior diffusion LMs and matching autoregressive models on benchmarks and reasoning tasks.

Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling

cs.LG · 2026-07-01 · unverdicted · novelty 6.0

A parallel-in-time τ-leaping sampler for absorbing discrete diffusion models is introduced, with an exponential-factorial convergence proof and empirical speedups of 7-9× on synthetic tasks and 1.45-1.86× on image/text tasks while using 50% fewer NFE.

Diffusion Language Model Parallel Decoding via Product-of-Experts Bridge

cs.CL · 2026-06-06 · unverdicted · novelty 6.0

PoE-Bridge uses a product-of-experts bridge between diffusion and autoregressive distributions, with DLM drafting plus rejection and importance sampling, to deliver 5x speedup over standard DLM decoding while recovering at least 95% of AR performance on math and coding tasks.

PulseCol: Periodically Refreshed Column-Sparse Attention for Accelerating Diffusion Language Models

cs.CL · 2026-05-20 · unverdicted · novelty 6.0

PulseCol introduces periodically refreshed column-sparse attention to achieve up to 1.95x speedup over FlashAttention in diffusion LLMs with maintained model quality.

BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

BitLM replaces per-token softmax with bitwise continuous diffusion inside causal blocks to generate multiple tokens in parallel while preserving autoregressive structure.

Coupling Models for One-Step Discrete Generation

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Coupling Models enable single-step discrete sequence generation via learned couplings to Gaussian latents and outperform prior one-step baselines on text perplexity, biological FBD, and image FID metrics.

Continuous Latent Diffusion Language Model

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

Towards A Generative Protein Evolution Machine with DPLM-Evo

cs.LG · 2026-04-30 · unverdicted · novelty 6.0 · 3 refs

DPLM-Evo introduces an evolutionary discrete diffusion framework with explicit edit prediction and contextual noising that claims SOTA single-sequence mutation effect prediction on ProteinGym while supporting variable-length evolution simulation.

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

cs.LG · 2025-05-22 · conditional · novelty 6.0

LLaDA-V is a diffusion-based multimodal large language model that reaches competitive or state-of-the-art results on visual instruction tasks while using a non-autoregressive architecture.

Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention

cs.CV · 2026-05-19 · unverdicted · novelty 5.0

BA-Att introduces pre-downsampled block selection with norm-sorting and diagonal covariance correction to approximate sparse attention, yielding up to 6.95x speedup at 50% sparsity across language, multimodal, and video models.

FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation--Full Version

cs.CL · 2026-04-07 · unverdicted · novelty 5.0

A training framework perturbs self-conditioning signals in diffusion language models to match few-step inference noise, enabling up to 400x faster sampling while surpassing standard continuous diffusion performance on sequence-to-sequence tasks.

Beyond Execution: Static-Analysis Rewards and Hint-Conditioned Diffusion RL for Code Generation

cs.SE · 2026-05-16 · unverdicted · novelty 4.0

Static checking rewards and moderate AST-based hints improve diffusion RL performance for code generation, with effectiveness varying by task difficulty across HumanEval, MBPP, and LiveCodeBench.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention cs.CV · 2026-05-19 · unverdicted · none · ref 64
BA-Att introduces pre-downsampled block selection with norm-sorting and diagonal covariance correction to approximate sparse attention, yielding up to 6.95x speedup at 50% sparsity across language, multimodal, and video models.

A reparameterized discrete diffusion model for text generation

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer