Discrete State Diffusion Models: A Sample Complexity Perspective

Aadithya Srikanth; Mudit Gaur; Vaneet Aggarwal

arxiv: 2510.10854 · v3 · pith:IA6FKF4Anew · submitted 2025-10-12 · cs.LG · cs.AI · stat.ML

keywords: models, diffusion, discrete-state, complexity, sample, theoretical, error, estimation

Abstract

Diffusion models have demonstrated remarkable performance in generating high-dimensional samples across domains such as vision, language, and the sciences. Although continuous-state diffusion models have been extensively studied both empirically and theoretically, discrete-state diffusion models, essential for applications involving text, sequences, and combinatorial structures, remain significantly less understood from a theoretical standpoint. In particular, all existing analyses of discrete-state models assume score estimation error bounds without studying sample complexity results. In this work, we present a principled theoretical framework for discrete-state diffusion, providing the first sample complexity bound of $\widetilde{\mathcal{O}}(\epsilon^{-2})$. Our structured decomposition of the score estimation error into statistical, approximation, optimization, and clipping components offers critical insights into how discrete-state models can be trained efficiently. This analysis addresses a fundamental gap in the literature and establishes the theoretical tractability and practical relevance of discrete-state diffusion models.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Streamlining Analysis and Design of Two-Dimensional Electronic Spectroscopy using Machine Learning
physics.chem-ph 2026-06 unverdicted novelty 6.0

A Gaussian mixture model is used to learn spectral densities from 2DES experiments, enabling extraction of vibronic couplings, spectral extrapolation, and optimized experiment selection across simulated and experiment...

Continuous Latent Diffusion Language Model
cs.CL 2026-05 unverdicted novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing l...