pith. sign in

hub Canonical reference

Continuous diffusion for categorical data

Canonical reference. 75% of citing Pith papers cite this work as background.

30 Pith papers citing it
Background 75% of classified citations
abstract

Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement. Their success hinges on the fact that the underlying physical phenomena are continuous. For inherently discrete and categorical data such as language, various diffusion-inspired alternatives have been proposed. However, the continuous nature of diffusion models conveys many benefits, and in this work we endeavour to preserve it. We propose CDCD, a framework for modelling categorical data with diffusion models that are continuous both in time and input space. We demonstrate its efficacy on several language modelling tasks.

hub tools

citation-role summary

background 7 method 1

citation-polarity summary

clear filters

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Variational Learning for Insertion-based Generation

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

Introduces the Insertion Process model for variable-length non-monotonic sequence generation via a bijective permutation mapping and permutation-based variational inference.

Infinite Mask Diffusion for Few-Step Distillation

cs.CL · 2026-05-11 · unverdicted · novelty 7.0

Infinite Mask Diffusion Models use stochastic infinite-state masks to overcome the factorization error lower bound in standard masked diffusion, achieving superior few-step performance on language tasks via distillation.

Discrete Stochastic Localization for Non-autoregressive Generation

cs.LG · 2026-02-18 · unverdicted · novelty 7.0

Discrete Stochastic Localization lets a single trained network support an entire family of per-token SNR paths for discrete sequence generation, with masked diffusion as a special case, and improves MAUVE scores when fine-tuning pretrained checkpoints.

Diffusion and Flow Matching Models for Tabular Data: A Survey

cs.LG · 2025-02-24 · unverdicted · novelty 7.0

First dedicated survey organizing diffusion and flow matching models for tabular data synthesis, imputation, anomaly detection, and related tasks, covering literature from 2015 to 2026 and highlighting open problems.

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

cs.CL · 2026-05-31 · unverdicted · novelty 6.0

Adapting LLaDA-8B-Instruct via Discrete Stochastic Localization with continuous per-token Gaussian noise yields continuous denoising that achieves top ROUGE-1 on zero-shot summarization at low step budgets and adds selective noisy-state robustness.

Fixed-Point Masked Generative Modeling

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

FP-MGMs with consistency loss and three-state reuse (CoFRe) reduce parameters by up to 38.8% and improve low-budget perplexity and FID versus standard masked generative models on text and images.

Discrete Stochastic Localization for Non-autoregressive Generation

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

DSL provides a continuous embedding framework where one denoiser supports a family of SNR paths for discrete sequences, improving MAUVE scores on OpenWebText and allowing random-order and hybrid sampling from a fine-tuned MDLM checkpoint.

ELF: Embedded Language Flows

cs.CL · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

ELF applies continuous-time flow matching in embedding space for language generation and reports outperforming prior discrete and continuous diffusion language models with fewer steps.

TextLDM: Language Modeling with Continuous Latent Diffusion

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

TextLDM applies DiT-style latent diffusion with flow matching to language modeling via a REPA-aligned VAE, outperforming prior diffusion LMs and matching GPT-2 when trained from scratch on OpenWebText2.

Continuous Latent Diffusion Language Model

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

Dream 7B: Diffusion Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Dream 7B is a 7B diffusion LLM that refines sequences in parallel via denoising and outperforms prior diffusion models on general, mathematical, and coding benchmarks with added flexibility in generation order and quality-speed tradeoffs.

citing papers explorer

Showing 7 of 7 citing papers after filters.

  • Large Language Diffusion Models cs.CL · 2025-02-14 · unverdicted · none · ref 47 · internal anchor

    LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

  • Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast cs.CL · 2026-05-02 · unverdicted · none · ref 9 · internal anchor

    FoCore uses self-contrast on early-converging high-density tokens to boost diffusion LLM quality on reasoning benchmarks while cutting decoding steps by over 2x.

  • Diffusion and Flow Matching Models for Tabular Data: A Survey cs.LG · 2025-02-24 · unverdicted · none · ref 101 · internal anchor

    First dedicated survey organizing diffusion and flow matching models for tabular data synthesis, imputation, anomaly detection, and related tasks, covering literature from 2015 to 2026 and highlighting open problems.

  • ELF: Embedded Language Flows cs.CL · 2026-05-11 · unverdicted · none · ref 13 · 2 links · internal anchor

    ELF applies continuous-time flow matching in embedding space for language generation and reports outperforming prior discrete and continuous diffusion language models with fewer steps.

  • Continuous Latent Diffusion Language Model cs.CL · 2026-05-07 · unverdicted · none · ref 21 · internal anchor

    Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

  • Thinking Diffusion: Penalize and Guide Visual-Grounded Reasoning in Diffusion Multimodal Language Models cs.AI · 2026-04-07 · unverdicted · none · ref 5 · internal anchor

    Position and step penalty plus visual reasoning guidance fix premature answering and weak visual grounding in diffusion MLLMs, delivering up to 7.5% accuracy gains and over 3x speedup.

  • LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning cs.LG · 2025-05-22 · conditional · none · ref 90 · internal anchor

    LLaDA-V is a diffusion-based multimodal large language model that reaches competitive or state-of-the-art results on visual instruction tasks while using a non-autoregressive architecture.