Your GFlowNet Secretly Learns an Optimal Transport Plan
Pith reviewed 2026-06-28 02:32 UTC · model grok-4.3
The pith
Fixing the initial flow distribution in a minimum-flow GFlowNet reduces its objective to a Kantorovich optimal transport problem whose solution is recovered by sampling the trained policy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Fixing the initial flow distribution in a minimum-flow GFlowNet reduces its objective to a Kantorovich OT problem with graph-induced shortest path costs. At the optimum, the learned GFlowNet policy therefore encodes an optimal transport plan from the source distribution to the target distribution: sampling trajectories from the minimum-flow GFlowNet recovers the corresponding optimal coupling.
What carries the argument
The reduction of the minimum-flow GFlowNet objective (with a fixed initial flow distribution) to the Kantorovich formulation of optimal transport, where the ground cost between two nodes is their shortest-path distance on the graph.
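For reference, the Kantorovich problem that the reduction targets can be written as below. The notation is an assumption made here and may differ from the paper's: \mu and \nu are the source and target distributions on the node set V, d_G(x,y) is the shortest-path distance from x to y in the directed graph, and \Pi(\mu,\nu) is the set of couplings.

```latex
\min_{\pi \in \Pi(\mu,\nu)} \; \sum_{x,y \in V} d_G(x,y)\,\pi(x,y),
\qquad
\Pi(\mu,\nu) = \Bigl\{ \pi \ge 0 \;:\; \sum_{y \in V} \pi(x,y) = \mu(x),\;\; \sum_{x \in V} \pi(x,y) = \nu(y) \Bigr\}.
```

If the minimum-flow objective is read as minimizing total edge flow under flow conservation (an assumption here, not a quotation of the paper's equations), the flow-side problem would take the following form, with E the edge set and F the edge flow:

```latex
\min_{F \ge 0} \; \sum_{(s \to s') \in E} F(s \to s')
\quad \text{s.t.} \quad
\mu(s) + \sum_{s' : (s' \to s) \in E} F(s' \to s)
\;=\;
\nu(s) + \sum_{s'' : (s \to s'') \in E} F(s \to s'')
\quad \forall s \in V.
```

This is a unit-cost minimum-cost-flow problem, whose optimal value is known to equal the transport cost under the shortest-path ground cost.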
If this is right
- GFlowNet training can be used to solve optimal transport problems on large graphs by learning edge flows with neural networks.
- The optimal transport plan is recovered simply by sampling trajectories from the trained GFlowNet.
- The equivalence is established only for non-acyclic GFlowNets trained with the minimum-flow objective and a fixed initial flow distribution.
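The second consequence above, recovering a coupling by sampling, can be illustrated with a minimal sketch. It assumes the usual flow-induced forward policy (move along an edge with probability proportional to its flow) and models termination at a node with probability proportional to the target mass absorbed there; that termination rule is an assumption, not the paper's stated construction. The 4-node directed cycle, the hand-written `edge_flow`, and the function name `sample_coupling` are all hypothetical stand-ins for a trained GFlowNet.

```python
import numpy as np

def sample_coupling(edge_flow, mu, nu, n_samples, rng):
    """Estimate the (start, end) coupling induced by a flow's Markov policy.

    edge_flow[s, t] is the flow on edge s -> t, mu is the start distribution,
    and nu[s] is the flow that terminates at node s.
    """
    n = len(mu)
    out_flow = edge_flow.sum(axis=1) + nu  # total flow leaving each node
    counts = np.zeros((n, n))
    for _ in range(n_samples):
        start = rng.choice(n, p=mu)
        state = start
        while True:
            # Options 0..n-1 are moves along edges; option n means "stop here".
            probs = np.append(edge_flow[state], nu[state]) / out_flow[state]
            choice = rng.choice(n + 1, p=probs)
            if choice == n:
                break
            state = choice
        counts[start, state] += 1
    return counts / n_samples

# Hypothetical example: directed cycle 0 -> 1 -> 2 -> 3 -> 0.
mu = np.array([0.5, 0.5, 0.0, 0.0])  # source mass on nodes 0 and 1
nu = np.array([0.0, 0.0, 0.5, 0.5])  # target mass on nodes 2 and 3

# Hand-written flow satisfying conservation; it stands in for learned flows.
edge_flow = np.zeros((4, 4))
edge_flow[0, 1] = 0.5
edge_flow[1, 2] = 1.0
edge_flow[2, 3] = 0.5

# Hop-count distances on the cycle: d(i, j) = (j - i) mod 4.
hop_cost = np.array([[(j - i) % 4 for j in range(4)] for i in range(4)], dtype=float)

rng = np.random.default_rng(0)
coupling = sample_coupling(edge_flow, mu, nu, 5000, rng)
print("estimated coupling:\n", coupling)
print("estimated transport cost:", (coupling * hop_cost).sum())
```

The total flow in this toy example is 0.5 + 1.0 + 0.5 = 2.0, so the estimated transport cost should land close to that value if sampling reproduces a cost-optimal coupling.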
Where Pith is reading between the lines
- Standard OT solvers could be used to warm-start or regularize GFlowNet training on moderate-sized graphs.
- The link suggests that other flow-based generative models might admit similar reductions to transport problems under appropriate objectives.
- GFlowNets could scale optimal transport computations to graphs that are too large for exact linear programming solvers.
Load-bearing premise
The GFlowNet must be non-acyclic and trained under the minimum-flow objective with a fixed initial flow distribution.
What would settle it
Compute the exact optimal coupling via a standard OT solver on the same graph using shortest-path costs, then compare it to the empirical distribution of trajectories sampled from a converged minimum-flow GFlowNet; any systematic mismatch falsifies the claimed equivalence.
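A minimal sketch of the exact-solver side of this check, using SciPy's `shortest_path` and `linprog`. The function names `exact_ot` and `coupling_gap`, the 4-node cycle, and the use of hop counts as edge lengths are assumptions for illustration; the paper's experiments may use a different solver and different graphs. Because optimal couplings need not be unique, the comparison below reports a cost gap and marginal errors rather than an entrywise difference.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse.csgraph import shortest_path

def exact_ot(adjacency, mu, nu):
    """Solve the Kantorovich LP with hop-count shortest-path costs."""
    # Zero entries of a dense adjacency matrix are treated as missing edges.
    cost = shortest_path(adjacency, method="D", directed=True, unweighted=True)
    if not np.all(np.isfinite(cost)):
        raise ValueError("graph must be strongly connected for finite costs")
    n = len(mu)
    # Row-major flattening: variable index i * n + j holds pi[i, j].
    row_sums = np.kron(np.eye(n), np.ones((1, n)))  # sum_j pi[i, j] = mu[i]
    col_sums = np.kron(np.ones((1, n)), np.eye(n))  # sum_i pi[i, j] = nu[j]
    res = linprog(
        c=cost.ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([mu, nu]),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, n), cost, res.fun

def coupling_gap(empirical, cost, optimal_value, mu, nu):
    """Compare an empirical coupling with the exact optimum."""
    return {
        "cost_gap": float((empirical * cost).sum() - optimal_value),
        "row_marginal_error": float(np.abs(empirical.sum(axis=1) - mu).sum()),
        "col_marginal_error": float(np.abs(empirical.sum(axis=0) - nu).sum()),
    }

# Hypothetical example: directed cycle 0 -> 1 -> 2 -> 3 -> 0.
adjacency = np.zeros((4, 4))
for i in range(4):
    adjacency[i, (i + 1) % 4] = 1.0
mu = np.array([0.5, 0.5, 0.0, 0.0])
nu = np.array([0.0, 0.0, 0.5, 0.5])

plan, cost, value = exact_ot(adjacency, mu, nu)

# Stand-in for a coupling estimated from sampled GFlowNet trajectories.
empirical = np.outer(mu, nu)
gap = coupling_gap(empirical, cost, value, mu, nu)
print("optimal cost:", value)
print("gap report:", gap)
```

A cost gap that stays clearly positive as the number of sampled trajectories grows, or marginal errors that do not shrink, would be the kind of systematic mismatch described above.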
Original abstract
Generative Flow Networks (GFlowNets) are a framework for sampling structured objects via stochastic trajectories in a directed graph. In this work, we establish a theoretical connection between non-acyclic GFlowNets and optimal transport (OT). We show that fixing the initial flow distribution in a minimum-flow GFlowNet reduces its objective to a Kantorovich OT problem with graph-induced shortest path costs. At the optimum, the learned GFlowNet policy therefore encodes an optimal transport plan from the source distribution to the target distribution: we show that sampling trajectories from the minimum-flow GFlowNet recovers the corresponding optimal coupling. Our formulation enables applying the GFlowNet learning framework to OT problems on large graphs via edge flows and neural parameterization. Experiments confirm agreement with exact OT solvers and demonstrate that GFlowNets can learn high-quality transport plans.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that fixing the initial flow distribution in a minimum-flow GFlowNet on a non-acyclic directed graph reduces the GFlowNet objective exactly to a Kantorovich optimal transport problem whose ground cost is the shortest-path distance induced by the graph. Consequently the learned edge flows define an optimal coupling whose marginals match the prescribed source and target distributions, and sampling trajectories from the trained policy recovers that coupling. The formulation is then used to solve OT instances on large graphs via neural parameterization of the flows, with experiments reported to match exact solvers.
Significance. If the reduction is correct, the result supplies a new, scalable route to OT on graphs by importing the GFlowNet training machinery and neural function approximation. Credit is due for the explicit reduction to the standard Kantorovich formulation (no free parameters introduced) and for the experimental verification against exact OT solvers.
minor comments (3)
- [§3.2] Eq. (8): the statement that the minimum-flow objective is 'identical' to the OT objective should be accompanied by an explicit side-by-side display of the two loss expressions to make the algebraic equivalence immediate.
- [Experiments] The experimental section reports agreement with exact solvers but does not state the precise graph sizes or the number of independent runs; adding these details would strengthen reproducibility.
- Notation: the symbol P is used both for the GFlowNet policy and for the OT coupling; a brief remark distinguishing the two usages would avoid confusion.
Simulated Author's Rebuttal
We thank the referee for the accurate summary of our manuscript and for the positive assessment of its significance. We appreciate the recommendation for minor revision. As the report lists no specific major comments, we have no points requiring point-by-point response or manuscript changes.
Circularity Check
No significant circularity; derivation reduces to external OT formulation
full rationale
The central claim is a direct mathematical reduction: fixing the initial flow in the minimum-flow GFlowNet objective yields the Kantorovich OT problem with shortest-path costs induced by the graph. This follows from flow conservation and the definition of the objective, without any self-referential fitting, load-bearing self-citation, or smuggled ansatz. The result is externally falsifiable via exact OT solvers, and experiments confirm agreement. No step in the provided derivation chain reduces by construction to the paper's own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Kantorovich formulation of optimal transport