pith. sign in

arxiv: 2605.05569 · v3 · pith:LXWVALZLnew · submitted 2026-05-07 · 🧮 math.OC · cs.LG

Stability of the Monge Map in Semi-Dual Optimal Transport

Pith reviewed 2026-05-20 23:46 UTC · model grok-4.3

classification 🧮 math.OC cs.LG
keywords optimal transportsemi-dual formulationMonge mapsaddle-point structureconvergence conditionsconstrained optimizationdual potential
0
0 comments X

The pith

The semi-dual optimal transport problem equates to constrained optimization due to its degenerate saddle-point structure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the semi-dual formulation of optimal transport has a degenerate saddle-point structure. This structure makes numerical solution of the problem equivalent to solving a constrained optimization task. Necessary and sufficient conditions for convergence of the Monge map follow from this equivalence without any need to assume optimality of the dual potential. A reader would care because the result directly accounts for why practical algorithms spend more iterations updating the transport map than the potential itself.

Core claim

The semi-dual formulation of the optimal transport problem has a degenerate saddle-point structure, and its numerical solution is equivalent to solving a constrained optimization problem. Necessary and sufficient conditions for the convergence of Monge maps can be derived without requiring optimality of the dual potential.

What carries the argument

The degenerate saddle-point structure of the semi-dual optimal transport formulation, which permits equivalence to a constrained optimization problem and supports derivation of Monge-map convergence conditions independent of dual optimality.

If this is right

  • Numerical algorithms can update the Monge map using convergence conditions that ignore whether the dual potential has reached optimality.
  • The observed need for extra iterations on the transport map in practice follows directly from the degenerate saddle-point structure.
  • Stability analysis of the Monge map becomes possible in settings where dual optimality is hard to certify or enforce.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same structural argument may apply to other transport problems that exhibit degenerate saddle points, such as certain unbalanced or entropic variants.
  • Implementations could deliberately relax dual-potential updates once the derived conditions on the map are met, potentially reducing total computation time.
  • Testing the conditions on low-dimensional examples with known exact maps would provide a direct numerical check of the equivalence claim.

Load-bearing premise

The semi-dual optimal transport problem admits a degenerate saddle-point structure that makes its numerical solution equivalent to a constrained optimization problem.

What would settle it

A concrete counter-example in which the semi-dual problem fails to reduce to constrained optimization, or in which the Monge map converges even though the stated necessary and sufficient conditions are violated.

Figures

Figures reproduced from arXiv: 2605.05569 by Anton Selitskiy, David Millard.

Figure 1
Figure 1. Figure 1: Convergence of transport map with divergence of potential. view at source ↗
Figure 2
Figure 2. Figure 2: ∥t − T ⋆∥ 2 L2(µ) and ∥∇ψ − ∇ψ ⋆∥ 2 L2(ν) with respect to K and ηψ/ηt. This viewpoint also explains the empirical observation of Makkuva et al. [2020] that unconstrained neural parameterizations of the transport map (e.g., ∇v in view at source ↗
Figure 5
Figure 5. Figure 5: E.4 OTM OTM uses the same max-correlation objective, but adds the published gradient-optimality penalty from optimal transport modeling [Rout et al., 2021]. In our setup this penalty has weight 0.1. The results are shown in view at source ↗
Figure 3
Figure 3. Figure 3: Convergence behavior of the transport map and potential for the OTP method. [PITH_FULL_IMAGE:figures/full_fig_p016_3.png] view at source ↗
Figure 3
Figure 3. Figure 3: Convergence behavior of the transport map and potential for the OTP method. view at source ↗
Figure 4
Figure 4. Figure 4: Convergence behavior of the transport map and potential for the Monge Map method. view at source ↗
Figure 4
Figure 4. Figure 4: Convergence behavior of the transport map and potential for the Monge Map method. [PITH_FULL_IMAGE:figures/full_fig_p017_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Convergence behavior of the transport map and potential for the Max Correlation method. view at source ↗
Figure 5
Figure 5. Figure 5: Convergence behavior of the transport map and potential for the Max Correlation method. [PITH_FULL_IMAGE:figures/full_fig_p018_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Convergence behavior of the transport map and potential for the OTM method. view at source ↗
Figure 6
Figure 6. Figure 6: Convergence behavior of the transport map and potential for the OTM method. [PITH_FULL_IMAGE:figures/full_fig_p019_6.png] view at source ↗
read the original abstract

This paper shows that the semi-dual formulation of the optimal transport problem has a degenerate saddle-point structure, and that its numerical solution is equivalent to solving a constrained optimization problem. We derive necessary and sufficient conditions for the convergence of Monge maps without requiring optimality of the dual potential. This analysis helps explain why, in practice, numerical algorithms often require more iterations to update the transport map than the potential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript shows that the semi-dual formulation of optimal transport possesses a degenerate saddle-point structure whose numerical solution is equivalent to a constrained optimization problem. It derives necessary and sufficient conditions for convergence of the Monge map that do not require optimality of the dual potential, and uses this to explain why practical algorithms typically need more iterations to stabilize the map than the potential.

Significance. If the central derivations are correct, the work supplies a useful theoretical account of observed numerical behavior in semi-dual OT solvers and separates map stability from full dual optimality. The reformulation as a constrained problem and the convergence criteria could inform algorithm design and stability analysis in applications of optimal transport.

major comments (1)
  1. §3.1, the equivalence argument: the reduction from the degenerate saddle-point to the constrained problem is stated at the level of the continuous formulation; it is not shown whether the same equivalence holds exactly for the discrete or semi-discrete discretizations that are used in the numerical section, which would be needed to justify the iteration-count explanation.
minor comments (2)
  1. Notation for the semi-dual functional and the Monge map is introduced without a consolidated table of symbols; adding one would improve readability.
  2. The numerical experiments section would benefit from a brief statement of the discretization scheme and the precise stopping criterion used to count iterations for the map versus the potential.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment, the recommendation of minor revision, and the constructive comment on the equivalence argument. We address the point below and will incorporate a clarification in the revised manuscript.

read point-by-point responses
  1. Referee: §3.1, the equivalence argument: the reduction from the degenerate saddle-point to the constrained problem is stated at the level of the continuous formulation; it is not shown whether the same equivalence holds exactly for the discrete or semi-discrete discretizations that are used in the numerical section, which would be needed to justify the iteration-count explanation.

    Authors: We agree that the link to the numerical experiments would be stronger if the equivalence were stated explicitly for the discrete setting. The reduction itself is purely algebraic and follows identically when the measures are atomic: the degenerate saddle-point structure reduces to a constrained finite-dimensional optimization problem by replacing integrals with finite sums over the support points. This is the mechanism that produces the observed lag between potential and map stabilization in the experiments of Section 4. In the revised manuscript we will insert a short remark (or a brief appendix paragraph) confirming that the same equivalence holds verbatim for the discrete and semi-discrete formulations used in the numerics, thereby directly justifying the iteration-count discussion. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The paper's central claims rest on observing the degenerate saddle-point structure of the semi-dual OT formulation and deriving necessary and sufficient convergence conditions for Monge maps that do not presuppose dual optimality. These steps are presented as direct consequences of the problem's convexity and marginal constraints rather than reductions to fitted inputs, self-citations, or definitional equivalences. No load-bearing premise collapses to a prior result by the same authors or to a renamed empirical pattern; the explanation for iteration counts follows from the structure without circular forcing. The analysis is therefore independent of its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the existence of a degenerate saddle-point structure in the semi-dual OT problem and on standard background facts from convex optimization; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption The semi-dual formulation of optimal transport admits a degenerate saddle-point structure.
    Stated directly in the abstract as the first result shown by the paper.

pith-pipeline@v0.9.0 · 5581 in / 1151 out tokens · 31387 ms · 2026-05-20T23:46:55.409153+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

79 extracted references · 79 canonical work pages · 4 internal anchors

  1. [1]

    , title =

    Monge, G. , title =. Histoire de l’Acad\'emie Royale des Sciences avec les M\'emoires de Math\'ematique & de Physique, Paris , year =

  2. [2]

    , title =

    Kantorovich, Leonid V. , title =. Dokl. Akad. Nauk SSSR , year =

  3. [3]

    Kantorovich, L. V. and Rubinshtein G. Sh. , title =. Dokl. Akad. Nauk SSSR , year =

  4. [4]

    Brenier, Yann , title =. Comm. Pure Appl. Math. , year =

  5. [5]

    and Brenier, Y

    Benamou, J.-D. and Brenier, Y. , title =. Numer. Math. , year =

  6. [6]

    Duke Mathematical Journal , year =

    McCann, Robert , title =. Duke Mathematical Journal , year =

  7. [7]

    Pavliotis, G. A. , title =

  8. [8]

    Lipman, Yaron and Chen, Ricky T. Q. and Ben-Hamu, Heli and Nickel, Maximilian and Le Matt , title =. https://arxiv.org/abs/2210.02747 , year =

  9. [9]

    Tyrrell and Wets Roger J.-B

    Rockafellar, R. Tyrrell and Wets Roger J.-B. , title =

  10. [10]

    , title =

    Santambrogio, F. , title =

  11. [11]

    , title =

    Chen, Yongxin and Georgiou, Tryphon T. , title =. SIAM Review , year =

  12. [12]

    Vershik, A. M. , title =. Russian Mathematical Surveys , volume =

  13. [13]

    Kantorovich, L. V. , title =. Uspekhi Matematicheskikh Nauk , volume =

  14. [14]

    Bogachev, V. I. and R\". Kolmogorov Problems on Equations for Stationary and Transition Probabilities of Diffusion Processes , journal =. 2023 , doi =

  15. [15]

    Annales de l'Institut Henri Poincaré (C) Analyse Non Linéaire , volume =

    Pratelli, Aldo , title =. Annales de l'Institut Henri Poincaré (C) Analyse Non Linéaire , volume =

  16. [16]

    Bogachev, V. I. and Kalinin, A. N. and Popova, S. N. , title =. Journal of Mathematical Sciences , volume =. 2019 , note =

  17. [17]

    and Pratelli, A

    Ambrosio, L. and Pratelli, A. , title =. Optimal Transportation and Applications , series =. 2003 , pages =

  18. [18]

    , title =

    Vaserstein, Leonid N. , title =. Problemy Peredachi Informatsii , volume =

  19. [19]

    Dobrushin, R. L. , title =. Theory of Probability and Its Applications , volume =

  20. [20]

    Fr\'echet, M , title =. C. R. Acad. Sci. Paris , volume =

  21. [21]

    Wasserstein Generative Adversarial Networks , booktitle =

    Mart. Wasserstein Generative Adversarial Networks , booktitle =. 2017 , publisher =

  22. [22]

    Improved Training of Wasserstein GANs , booktitle =

    Ishaan Gulrajani and Faruk Ahmed and Mart. Improved Training of Wasserstein GANs , booktitle =. 2017 , url =

  23. [23]

    Brian D. O. Anderson , title =. Stochastic Processes and their Applications , volume =. 1982 , doi =

  24. [24]

    Tyrrell and Wets, Roger J.-B

    Rockafellar, R. Tyrrell and Wets, Roger J.-B. , title =. Pacific Journal of Mathematics , volume =

  25. [25]

    arXiv preprint arXiv:2110.02999 , year =

    Generative Modeling with Optimal Transport Maps , author =. arXiv preprint arXiv:2110.02999 , year =

  26. [26]

    Calculus of Variations and Partial Differential Equations , volume =

    Existence, Duality, and Cyclical Monotonicity for Weak Transport Costs , author =. Calculus of Variations and Partial Differential Equations , volume =. 2019 , doi =

  27. [27]

    Tyrrell , title =

    Rockafellar, R. Tyrrell , title =. Nonlinear Operators and the Calculus of Variations , editor =. 1976 , series =

  28. [28]

    2002 , series =

    Kallenberg, Olav , title =. 2002 , series =

  29. [29]

    Advances in Neural Information Processing Systems , volume=

    Sinkhorn Distances: Lightspeed Computation of Optimal Transport , author=. Advances in Neural Information Processing Systems , volume=

  30. [30]

    The American Mathematical Monthly , volume =

    Sinkhorn, Richard , title =. The American Mathematical Monthly , volume =

  31. [31]

    Interspeech , year=

    Training-Free Voice Conversion with Factorized Optimal Transport , author=. Interspeech , year=

  32. [32]

    IEEE TPAMI , year=

    Optimal Transport for Domain Adaptation , author=. IEEE TPAMI , year=

  33. [33]

    ICASSP , year=

    X-vectors: Robust DNN embeddings for speaker recognition , author=. ICASSP , year=

  34. [34]

    Advances in Neural Information Processing Systems , volume=

    HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis , author=. Advances in Neural Information Processing Systems , volume=

  35. [35]

    International Conference on Learning Representations (ICLR) , year=

    DiffWave: A Versatile Diffusion Model for Audio Synthesis , author=. International Conference on Learning Representations (ICLR) , year=

  36. [36]

    Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) , pages=

    WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching , author=. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) , pages=

  37. [37]

    2020 , journal=

    Denoising Diffusion Probabilistic Models , author=. 2020 , journal=

  38. [38]

    ICLR , year=

    Score-Based Generative Modeling through Stochastic Differential Equations , author=. ICLR , year=

  39. [39]

    Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse , year =

    Schr. Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse , year =

  40. [40]

    Chetrite, Rapha. E. Schr. The European Physical Journal H , volume =

  41. [41]

    Discrete and Continuous Dynamical Systems , year=

    A survey of the Schrödinger problem and some of its connections with optimal transport , author=. Discrete and Continuous Dynamical Systems , year=

  42. [42]

    arXiv preprint arXiv:2106.01357 , year =

    Peluchetti, Stefano , title =. arXiv preprint arXiv:2106.01357 , year =

  43. [43]

    Hitchcock, F. L. , title =. Journal of Mathematics and Physics , volume =. 1941 , doi =

  44. [44]

    , title =

    Kantorovich, Leonid V. , title =. 1939 , publisher =

  45. [45]

    , title =

    Kantorovich, Leonid V. , title =. Management Science , volume =

  46. [46]

    Rubinshtein, G. S. , title =. Vestnik Leningrad University. Mathematics, Mechanics, Astronomy , volume =. 1958 , note =

  47. [47]

    , title =

    Dantzig, George B. , title =

  48. [48]

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. Proc. ICLR , year=

  49. [49]

    Optimal Transport

    Villani, C\'. Optimal Transport. Old and New , publisher =. 2009 , OPTkey =

  50. [50]

    Neural Optimal Transport , author=. Proc. ICLR , year=

  51. [51]

    Proceedings of the 37th International Conference on Machine Learning , pages =

    Optimal Transport Mapping via Input Convex Neural Networks , author =. Proceedings of the 37th International Conference on Machine Learning , pages =. 2020 , editor =

  52. [52]

    NeurIPS , year=

    Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark , author=. NeurIPS , year=

  53. [53]

    ICML , year=

    Optimal Transport Mapping via Input Convex Neural Networks , author=. ICML , year=

  54. [54]

    arXiv preprint arXiv:1909.13082 , year=

    Wasserstein-2 Generative Networks , author=. arXiv preprint arXiv:1909.13082 , year=

  55. [55]

    SIAM Review , year=

    Semi-dual Regularized Optimal Transport , author=. SIAM Review , year=

  56. [56]

    arXiv , year=

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. arXiv , year=

  57. [57]

    arXiv:2112.07275 , year=

    Parameter Tuning and Model Selection in Optimal Transport with Semi-dual Brenier Formulation , author=. arXiv:2112.07275 , year=

  58. [58]

    arXiv:2602.03566 , year=

    Riemannian Neural Optimal Transport , author=. arXiv:2602.03566 , year=

  59. [59]

    arXiv , year=

    Three-Player Wasserstein GAN via Amortised Duality , author=. arXiv , year=

  60. [60]

    arXiv , year=

    Wasserstein GAN with Quadratic Transport Cost , author=. arXiv , year=

  61. [61]

    arXiv preprint arXiv:2106.03812 , year =

    Neural Monge Map Estimation and Its Applications , author =. arXiv preprint arXiv:2106.03812 , year =

  62. [62]

    ICLR , year =

    Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport , author =. ICLR , year =

  63. [63]

    2-Wasserstein Approximation via Restricted Convex Potentials with Application to Improved Training for GANs

    2-wasserstein approximation via restricted convex potentials with application to improved training for gans , author =. arXiv preprint:1902.07197 , year =

  64. [64]

    , title =

    Ambrosio, L. , title =. Optimal Transportation and Applications , series =. 2003 , pages =

  65. [65]

    NerISP , year =

    Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan , author =. NerISP , year =

  66. [66]

    Proceedings of the Edinburgh Mathematical Society , author=

    On. Proceedings of the Edinburgh Mathematical Society , author=. 2011 , pages=. doi:10.1017/S001309150800117X , number=

  67. [67]

    The. Bull. Amer. Math. Soc. , author=. 2014 , pages=. doi:https://doi.org/10.1090/S0273-0979-2014-01459-4 , number=

  68. [68]

    (q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs

    (q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs , author=. arXiv preprint arXiv:1902.03642 , year=

  69. [69]

    Acta Math

    The geometry of optimal transportation , volume=. Acta Math. , author=. 1996 , pages=

  70. [70]

    , title =

    Vershik, Anatoly M. , title =. The Mathematical Intelligencer , volume =. 2013 , doi =

  71. [71]

    Annals of Mathematics , volume =

    von Neumann, John , title =. Annals of Mathematics , volume =

  72. [72]

    GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

    Martin Heusel and Hubert Ramsauer and Thomas Unterthiner and Bernhard Nessler and G. GANs Trained by a Two Time-Scale Update Rule Converge to a Nash Equilibrium , journal =. 2017 , url =. 1706.08500 , timestamp =

  73. [73]

    Minimax estimation of smooth optimal transport maps , journal =

    H\". Minimax estimation of smooth optimal transport maps , journal =

  74. [74]

    and Kolesnikov, Aleksandr V

    Bogachev, Vladimir I. and Kolesnikov, Aleksandr V. , title =. Russian Math. Surveys , year =

  75. [75]

    , title =

    Figalli, Alessio and Kim, Young-Heon and McCann, Robert J. , title =. Archive for Rational Mechanics and Analysis , year =

  76. [76]

    , title =

    Evans, Lawrence C. , title =. Current developments in mathematics. Papers from the conference held in Cambridge, MA, USA, 1997. , year =

  77. [77]

    arXiv preprint arXiv:2201.12220 , year=

    Neural optimal transport , author=. arXiv preprint arXiv:2201.12220 , year=

  78. [78]

    Transactions on Machine Learning Research , pages=

    Neural monge map estimation and its applications , author=. Transactions on Machine Learning Research , pages=

  79. [79]

    arXiv preprint arXiv:2502.01310 , year=

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. arXiv preprint arXiv:2502.01310 , year=