Recognition: unknown
Binomial flows: Denoising and flow matching for discrete ordinal data
Pith reviewed 2026-05-09 20:03 UTC · model grok-4.3
The pith
Binomial flows adapt continuous denoising techniques to discrete ordinal data so one model can denoise, sample, and return exact likelihoods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Binomial flows close the gap between continuous and discrete diffusion by extending Tweedie's formula to discrete non-negative ordinal data. The resulting framework supplies a single training recipe that produces a denoiser usable for both sampling trajectories and exact likelihood evaluation.
What carries the argument
Binomial flows: a discrete analogue of continuous probability flows that preserves the Tweedie relation between the denoiser and the score while enabling exact likelihood computation.
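For context, the continuous-space identity the paper generalizes is the standard Gaussian form of Tweedie's formula (stated here in generic notation as background, not taken from the paper): for $x_t = x_0 + \sigma_t \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$,

$$\nabla_{x_t} \log p_t(x_t) \;=\; \frac{\mathbb{E}[x_0 \mid x_t] - x_t}{\sigma_t^2},$$

so a network trained by regression to $\mathbb{E}[x_0 \mid x_t]$ directly supplies the score used in sampling. The paper's contribution is a binomial counterpart of this identity for non-negative integer data.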
If this is right
- A single trained model can perform denoising, generate new samples, and report exact likelihoods without separate networks or approximations.
- The method applies directly to any non-negative integer-valued ordinal data such as pixel intensities or count statistics.
- Training reduces to standard denoising objectives while sampling follows the same flow-matching procedure used in continuous spaces (see the sketch after this list).
- Competitive performance is observed on both synthetic distributions and real-world ordinal datasets.
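As referenced in the third bullet above, here is a minimal sketch of such a denoising objective under binomial-thinning corruption. This is an illustrative PyTorch setup under assumptions of ours: the model interface, the schedule value p_t, and the thinning corruption are not taken from the paper.

import torch

def denoising_loss(model, x0, t, p_t):
    # Corrupt integer counts by binomial thinning: y | x0 ~ Binomial(x0, p_t).
    # (Assumed corruption process; the paper's noising schedule may differ.)
    y = torch.binomial(x0.float(), torch.full_like(x0.float(), p_t))
    # The network regresses the clean counts; with MSE the optimum is the
    # posterior mean E[x0 | y, t], the Bayes-optimal Bregman predictor (cf. [5]).
    xhat = model(y, t)
    return ((xhat - x0.float()) ** 2).mean()

The point of the sketch is only that the objective is ordinary supervised regression; no score matching or rate estimation is needed at training time.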
Where Pith is reading between the lines
- The same binomial construction might extend to other discrete structures such as categorical variables if an appropriate flow is defined.
- Hybrid models could route continuous and ordinal features through separate binomial and Gaussian flows inside one network.
- Exact likelihoods open the door to using these models for tasks that require calibrated uncertainty, such as anomaly detection on count data.
Load-bearing premise
The relation between a learned denoiser and the true score that holds in continuous spaces can be carried over to binomial distributions without losing the ability to compute exact likelihoods.
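To make this premise concrete, here is one form the discrete analogue can take, assuming the corruption is binomial thinning $y \mid x \sim \mathrm{Binomial}(x, p)$; this is our reconstruction for illustration, not necessarily the paper's exact construction. Writing $p_Y(y) = \sum_x \pi(x) \binom{x}{y} p^y (1-p)^{x-y}$ for the corrupted marginal and using $(y+1)\binom{x}{y+1} = (x-y)\binom{x}{y}$, a short computation gives the Robbins-style identity

$$\mathbb{E}[x \mid y] \;=\; y + \frac{1-p}{p}\,(y+1)\,\frac{p_Y(y+1)}{p_Y(y)},$$

so the posterior-mean denoiser is an affine function of a probability ratio, and that ratio plays the role the score gradient plays in the continuous formula above.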
What would settle it
Train the model on a small synthetic binomial dataset whose true likelihoods are known by direct enumeration; if the model's likelihood estimates deviate systematically from the enumerated values, the extension fails.
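The enumeration half of that protocol is cheap to run. A self-contained sketch, under the same illustrative binomial-thinning assumption as above (all names here are ours, not the paper's), that verifies the ratio identity by direct summation and produces the exact marginals a trained model's likelihoods would be compared against:

import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)

# Toy prior pi(x) over counts x in {0, ..., K}.
K = 12
prior = rng.dirichlet(np.ones(K + 1))
p = 0.35  # survival probability of the thinning y | x ~ Binomial(x, p)

# Exact marginal p_Y(y) = sum_x pi(x) * Binom(y; x, p), by enumeration.
ys = np.arange(K + 1)[:, None]    # rows: observed y
xs = np.arange(K + 1)[None, :]    # columns: clean x
lik = binom.pmf(ys, xs, p)        # lik[y, x] = P(y | x)
p_Y = lik @ prior

# Posterior mean E[x | y] by enumeration ...
post_mean = (lik * prior) @ np.arange(K + 1) / p_Y

# ... and via the discrete Tweedie ratio formula (valid for y < K).
y = np.arange(K)
tweedie = y + (1 - p) / p * (y + 1) * p_Y[y + 1] / p_Y[y]
assert np.allclose(post_mean[:K], tweedie)

# A trained model's reported log-likelihoods for observed y can now be
# compared against np.log(p_Y); a systematic gap falsifies exactness.
print("discrete Tweedie identity holds; exact marginals:", np.round(p_Y, 4))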
Original abstract
Flow-based generative modeling in continuous spaces exploit Tweedie's formula to express the denoiser (learned in training) as a score function (used in sampling). In contrast, this relation has been largely missing in the discrete setting where common approaches focus on learning discrete scores and rates. In this work we close this gap for discrete non-negative ordinal data by introducing Binomial flows. Our framework provides a simple recipe for training a discrete diffusion model which simultaneously denoises, samples, and estimates exact likelihoods. We verify our methodology on synthetic examples and obtain competitive results on real-world data sets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Binomial flows, a framework adapting continuous flow-matching and Tweedie's formula to discrete non-negative ordinal data. It claims this yields a simple recipe for training a discrete diffusion model that simultaneously supports denoising via regression, sampling via the implied score, and exact likelihood estimation, with verification on synthetic examples and competitive results on real-world datasets.
Significance. If the central claims hold, the work would be significant for bridging continuous and discrete generative modeling by enabling exact likelihoods in the discrete ordinal setting, where most prior approaches focus only on scores or rates. This could unify denoising, sampling, and likelihood computation under one parameterization for ordinal data applications.
major comments (2)
- [§3.2, §4.1] Binomial flow construction and likelihood derivation: the central claim that the framework yields exact likelihoods requires an explicit, tractable discrete change-of-variables formula. The ratio of binomial probabilities introduces normalizing constants that depend on the learned drift; without a closed-form expression, or an efficient marginalization that avoids summation over latent trajectories, the 'exact' part of the claim is not yet demonstrated.
- [§4.2] Extension of Tweedie's formula: the adaptation from the continuous to the discrete ordinal setting via binomial counts is presented as preserving the denoiser-score relation for both sampling and likelihoods, but the manuscript does not provide an explicit verification that the discrete Radon-Nikodym analogue remains parameter-free or closed-form after the binomial reparameterization.
minor comments (2)
- Notation for the binomial probability path (e.g., the definition of the drift term) is introduced without a clear table comparing it to the continuous case, which would aid readability.
- The synthetic verification section would benefit from an explicit statement of the number of latent trajectories summed (or avoided) when reporting likelihood values.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback. We address each major comment below and will incorporate clarifications to strengthen the presentation of the exact likelihood and discrete Tweedie derivations.
Point-by-point responses
- Referee: [§3.2, §4.1] Binomial flow construction and likelihood derivation: the central claim that the framework yields exact likelihoods requires an explicit, tractable discrete change-of-variables formula. The ratio of binomial probabilities introduces normalizing constants that depend on the learned drift; without a closed-form expression, or an efficient marginalization that avoids summation over latent trajectories, the 'exact' part of the claim is not yet demonstrated.
Authors: We agree that §4.1 expresses the likelihood via the discrete change-of-variables formula involving the ratio of binomial transition probabilities, and that this ratio depends on the learned drift. The binomial flow construction in §3.2 ensures that the marginal probability can be obtained exactly as a product of binomial conditionals along the flow path, without enumerating all trajectories, because the transitions factorize across dimensions. The normalizing constants are absorbed into the regression target during training and do not require separate summation at inference time. In the revision we will add an explicit algorithm box and a short derivation showing the cancellation that yields a closed-form marginal likelihood for any fixed drift parameterization (a schematic of such a factorized computation appears after these responses). revision: yes
- Referee: [§4.2] Extension of Tweedie's formula: the adaptation from the continuous to the discrete ordinal setting via binomial counts is presented as preserving the denoiser-score relation for both sampling and likelihoods, but the manuscript does not provide an explicit verification that the discrete Radon-Nikodym analogue remains parameter-free or closed-form after the binomial reparameterization.
Authors: The derivation in §4.2 shows that the discrete score is recovered from the binomial denoiser by a ratio whose parameter-dependent terms cancel exactly because the binomial transition kernel is chosen to match the flow-matching objective. This cancellation is verified numerically on the synthetic examples, where the implied score matches the ground-truth score. To make the verification fully explicit, we will insert a short proposition in the revised §4.2 that states the discrete Radon-Nikodym derivative and proves it is independent of the learned parameters in the ratio, mirroring the continuous case. revision: yes
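To illustrate the factorization invoked in the first response, here is a schematic of a likelihood computed as a product of binomial conditionals along a discretized path. The model_params interface is hypothetical, and the identification of this product with the marginal likelihood is the paper's claim, not something the sketch establishes.

from scipy.stats import binom

def flow_log_likelihood(traj, model_params):
    """Sum of binomial conditional log-probabilities along a discretized
    flow path traj[0], ..., traj[T] (each an integer array over dimensions).

    model_params(t, x_t) -> (n, q) is a hypothetical learned parameterization
    of the transition x_{t+1} | x_t ~ Binomial(n, q), elementwise per dimension.
    """
    ll = 0.0
    for t in range(len(traj) - 1):
        n, q = model_params(t, traj[t])
        # Conditionals factorize across dimensions, so per-dimension
        # log-pmfs simply add; no sum over latent trajectories appears.
        ll += binom.logpmf(traj[t + 1], n, q).sum()
    return ll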
Circularity Check
No circularity: binomial flow construction is independently derived from discrete probability paths
Full rationale
The paper defines binomial flows as a new parameterization for discrete ordinal data that extends Tweedie's formula via explicit ratios of binomial probabilities. The denoiser, score, sampling procedure, and likelihood estimate are all obtained directly from the constructed probability path and the associated discrete change-of-variables formula; none of these quantities is obtained by fitting a parameter to a held-out subset and then relabeling the fit as a prediction. No load-bearing step relies on a self-citation whose content is itself unverified, or on an ansatz imported without independent justification. The derivation is therefore self-contained, can be checked against external benchmarks, and does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Tweedie's formula, relating the denoiser to the score function, extends to discrete non-negative ordinal data when binomial flows are used.
Reference graph
Works this paper leans on
- [1] Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, and Rianne van den Berg. Structured denoising diffusion models in discrete state-spaces. Advances in Neural Information Processing Systems, 34:17981–17993, 2021.
- [2] Michael Samuel Albergo and Eric Vanden-Eijnden. Building normalizing flows with stochastic interpolants. In The Eleventh International Conference on Learning Representations, 2023.
- [3] Amarjit Budhiraja, Paul Dupuis, and Vasileios Maroulas. Variational representations for continuous time processes. Ann. Inst. Henri Poincaré Probab. Stat., 47(3):725–747, 2011.
- [4] Sagnik Bhattacharya, Abhiram Rao Gorle, Ahsan Bilal, Connor Ding, Amit Kumar Singh Yadav, and Tsachy Weissman. ItDPDM: Information-theoretic discrete Poisson diffusion model. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
- [5] Arindam Banerjee, Xin Guo, and Hui Wang. On the optimality of conditional expectation as a Bregman predictor. IEEE Transactions on Information Theory, 51(7):2664–2669, 2005.
- [6] Francis Bach and Saeed Saremi. Sampling binary data by denoising through score functions. In Forty-second International Conference on Machine Learning, 2025.
- [7] Valentin De Bortoli, James Thornton, Jeremy Heng, and Arnaud Doucet. Diffusion Schrödinger bridge with applications to score-based generative modeling. In Advances in Neural Information Processing Systems, 2021.
- [8] Andrew Campbell, Joe Benton, Valentin De Bortoli, Tom Rainforth, George Deligiannidis, and Arnaud Doucet. A continuous time framework for discrete denoising models. In Advances in Neural Information Processing Systems, 2022.
- [9] Yifan Chen, Mark Goldstein, Mengjian Hua, Michael Samuel Albergo, Nicholas Matthew Boffi, and Eric Vanden-Eijnden. Probabilistic forecasting with stochastic interpolants and Föllmer processes. In Forty-first International Conference on Machine Learning, 2024.
- [10] Giovanni Conforti and Christian Léonard. Reciprocal classes of random walks on graphs. Stochastic Process. Appl., 127(6):1870–1896, 2017.
- [11]
- [12] Andrew Campbell, Jason Yim, Regina Barzilay, Tom Rainforth, and Tommi Jaakkola. Generative flows on discrete state-spaces: Enabling multimodal flows with applications to protein co-design. In Forty-first International Conference on Machine Learning, 2024.
- [13] Tianqi Chen and Mingyuan Zhou. Learning to jump: Thinning and thickening latent counts for generative modeling. In International Conference on Machine Learning, pages 5367–5382. PMLR, 2023.
- [14] Giannis Daras, Yuval Dagan, Alex Dimakis, and Constantinos Costis Daskalakis. Consistent diffusion models: Mitigating sampling drift by learning to be consistent. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- [15] Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, and Yaron Lipman. Discrete flow matching. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
- [16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
- [17] Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, and Max Welling. Argmax flows and multinomial diffusion: Learning categorical distributions. In Advances in Neural Information Processing Systems, 2021.
- [18] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. Advances in Neural Information Processing Systems, 35:26565–26577, 2022.
- [19] Grigoriy Ksenofontov and Alexander Korotin. Categorical Schrödinger bridge matching. In Forty-second International Conference on Machine Learning, 2025.
- [20] Jun Hyeong Kim, Seonghwan Kim, Seokhyun Moon, Hyeongwoo Kim, Jeheon Woo, and Woo Youn Kim. Discrete diffusion Schrödinger bridge matching for graph transformation. In The Thirteenth International Conference on Learning Representations, 2025.
- [21] Bo'az Klartag and Joseph Lehec. Poisson processes and a log-concave Bernstein theorem. Studia Math., 247(1):85–107, 2019.
- [22] Christian Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete Contin. Dyn. Syst., 34(4):1533–1574, 2014.
- [23] Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow matching for generative modeling. In The Eleventh International Conference on Learning Representations, 2023.
- [24] Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations, 2023.
- [25] Tianhong Li and Kaiming He. Back to basics: Let denoising generative models denoise. arXiv preprint arXiv:2511.13720, 2025.
- [26] Yaron Lipman, Marton Havasi, Peter Holderrieth, Neta Shaul, Matt Le, Brian Karrer, Ricky T. Q. Chen, David Lopez-Paz, Heli Ben-Hamu, and Itai Gat. Flow matching guide and code. arXiv preprint arXiv:2412.06264, 2024.
- [27] Aaron Lou, Chenlin Meng, and Stefano Ermon. Discrete diffusion modeling by estimating the ratios of the data distribution. In Forty-first International Conference on Machine Learning, 2024.
- [28] Sulin Liu, Juno Nam, Andrew Campbell, Hannes Stark, Yilun Xu, Tommi Jaakkola, and Rafael Gomez-Bombarelli. Think while you generate: Discrete diffusion with planned denoising. In The Thirteenth International Conference on Learning Representations, 2025.
- [29] Andrea Montanari. Sampling, diffusions, and stochastic localization. arXiv preprint arXiv:2305.10690, 2023.
- [30] Dan Mikulincer and Yair Shenfeld. The Brownian transport map. Probab. Theory Related Fields, 190(1-2):379–444, 2024.
- [31] Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 8162–8171. PMLR, 2021.
- [32] Hunter Nisonoff, Junhao Xiong, Stephan Allenspach, and Jennifer Listgarten. Unlocking guidance for discrete state-space diffusion and flow models. In International Conference on Learning Representations, 2025. arXiv:2406.01572.
- [33] Stefano Peluchetti. Diffusion bridge mixture transports, Schrödinger bridge problems and generative modeling. Journal of Machine Learning Research, 24(374):1–51, 2023.
- [34] Le-Tuyet-Nhi Pham, Dario Shariatian, Antonio Ocello, Giovanni Conforti, and Alain Oliviero Durmus. Discrete Markov probabilistic models: An improved discrete score-based framework with sharp convergence bounds under minimal assumptions. In Forty-second International Conference on Machine Learning, 2025.
- [35] Yinuo Ren, Haoxuan Chen, Grant M. Rotskoff, and Lexing Ying. How discrete and continuous diffusion meet: Comprehensive analysis of discrete diffusion models via a stochastic integral framework. In The Thirteenth International Conference on Learning Representations, 2025.
- [36] Yuyang Shi, Valentin De Bortoli, Andrew Campbell, and Arnaud Doucet. Diffusion Schrödinger bridge matching. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- [37] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In Advances in Neural Information Processing Systems, volume 32, 2019.
- [38] Javier E. Santos, Zachary R. Fox, Nicholas Lubbers, and Yen Ting Lin. Blackout diffusion: Generative diffusion models in discrete-state spaces. In International Conference on Machine Learning, pages 9034–9059. PMLR, 2023.
- [39] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021.
- [40] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021.
- [41] Belinda Tzen and Maxim Raginsky. Theoretical guarantees for sampling and inference in generative models with latent diffusions. In Conference on Learning Theory, pages 3084–3114. PMLR, 2019.
- [42] Huangjie Zheng, Shansan Gong, Ruixiang Zhang, Tianrong Chen, Jiatao Gu, Mingyuan Zhou, Navdeep Jaitly, and Yizhe Zhang. Continuously augmented discrete diffusion model for categorical generative modeling. arXiv preprint arXiv:2510.01329, 2025.