Polyak-Lojasiewicz Inequality for Quadratically Regularized Optimal Transport
Pith reviewed 2026-06-29 15:35 UTC · model grok-4.3
The pith
The dual of quadratically regularized optimal transport satisfies a local Polyak-Lojasiewicz inequality near the optimum with explicit constants.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under mild assumptions covering both continuous and semi-discrete transport problems, the dual QOT objective satisfies a local error bound and a Polyak-Lojasiewicz inequality, with explicit constants depending only on the problem primitives. These results are obtained by functional-analytic techniques exploiting that near the optimum, the argument of the positive part function is positive on the interior of the support of the optimal coupling.
What carries the argument
Local positivity of the argument inside the positive part function on the interior of the optimal coupling support, which supplies the quantitative curvature needed for the error bound and PL inequality.
If this is right
- Gradient ascent on the dual converges linearly with an explicit contraction rate that depends only on problem primitives.
- Coordinate ascent on the dual converges linearly with an explicit rate depending only on problem primitives.
- Coordinate gradient ascent on the dual converges linearly with an explicit rate depending only on problem primitives.
- The same curvature constants apply uniformly to both continuous and semi-discrete formulations.
Where Pith is reading between the lines
- Quadratic regularization may therefore be algorithmically comparable to entropic regularization in terms of guaranteed linear rates.
- The explicit constants could be used to select step sizes or stopping tolerances in practical QOT solvers.
- Analogous positivity arguments might extend the PL property to other non-smooth regularizers in optimal transport.
Load-bearing premise
Near the optimum the argument of the positive part function stays positive on the interior of the support of the optimal coupling.
What would settle it
A concrete QOT instance in which the dual objective violates the stated local error bound or in which gradient ascent on the dual exhibits only sublinear convergence.
read the original abstract
Quadratically regularized optimal transport (QOT) is an alternative to entropic regularization that yields sparse couplings and avoids numerical instabilities due to exponential scaling. From an optimization viewpoint, the dual QOT objective is concave but features a positive part function which prevents strong concavity and reduces smoothness of optimizers. Consequently, standard arguments for linear convergence of algorithms do not apply. In this paper, we nevertheless establish a quantitative curvature property for the QOT dual. Under mild assumptions covering both continuous and semi-discrete transport problems, we prove a local error bound and a Polyak-Lojasiewicz (PL) inequality, with explicit constants depending only on the problem primitives. These results are obtained by functional-analytic techniques exploiting that near the optimum, the argument of the positive part function is positive on the interior of the support of the optimal coupling. As applications, we derive linear convergence of the gradient ascent, coordinate ascent, and coordinate gradient ascent algorithms on the dual problem, with explicit contraction rates.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper establishes a local error bound and Polyak-Łojasiewicz inequality for the dual objective of quadratically regularized optimal transport (QOT). Under mild assumptions applicable to both continuous and semi-discrete problems, it derives explicit constants depending only on problem primitives via functional-analytic techniques that exploit positivity of the positive-part argument on the interior support of the optimal coupling near the optimum. This yields linear convergence guarantees (with explicit rates) for gradient ascent, coordinate ascent, and coordinate gradient ascent on the dual.
Significance. If the local positivity property holds under the stated mild assumptions, the result supplies the first explicit curvature analysis for the non-strongly-concave QOT dual, enabling rigorous linear-rate guarantees for first-order methods in a setting where entropic regularization is avoided. The explicit dependence of constants solely on primitives is a clear strength.
major comments (1)
- [Abstract] Abstract (and the functional-analytic argument described therein): the derivation of the local error bound and PL inequality with explicit constants rests on the claim that, near the optimum, the argument of the positive-part function is positive on the interior of the support of the optimal coupling. The paper presents this as following from the mild assumptions covering semi-discrete cases, yet atoms or boundary mass in the optimal plan could violate interior positivity; without an explicit verification or additional hypothesis in the main argument, the claimed constants and local curvature do not necessarily follow.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for identifying a point that requires clarification in the justification of the interior positivity property. We address the major comment below and will strengthen the manuscript accordingly.
read point-by-point responses
-
Referee: [Abstract] Abstract (and the functional-analytic argument described therein): the derivation of the local error bound and PL inequality with explicit constants rests on the claim that, near the optimum, the argument of the positive-part function is positive on the interior of the support of the optimal coupling. The paper presents this as following from the mild assumptions covering semi-discrete cases, yet atoms or boundary mass in the optimal plan could violate interior positivity; without an explicit verification or additional hypothesis in the main argument, the claimed constants and local curvature do not necessarily follow.
Authors: We agree that the current presentation does not contain an explicit verification that the mild assumptions suffice to guarantee the required interior positivity near the optimum, particularly when the optimal plan may involve atoms (as in semi-discrete settings) or boundary mass. While the assumptions are chosen to cover both continuous and semi-discrete problems, an additional lemma or remark making this step rigorous would eliminate any ambiguity in the derivation of the explicit constants. In the revised manuscript we will insert such a verification (based on the problem primitives and the definition of the support) immediately before the main functional-analytic argument, thereby confirming that the local error bound and PL inequality hold with the stated constants under the given hypotheses. revision: yes
Circularity Check
No circularity: derivation uses independent functional analysis under stated assumptions
full rationale
The paper proves a local error bound and PL inequality for the dual QOT objective via functional-analytic techniques that exploit positivity of the positive-part argument near the optimum on the interior support of the optimal coupling. This positivity is presented as following from the mild assumptions covering continuous and semi-discrete problems, not as a self-referential definition or fitted input. No equations reduce by construction to prior results via self-citation chains, uniqueness theorems imported from the same authors, or ansatzes smuggled through citations. The explicit constants depend only on problem primitives, and the derivation chain is self-contained without renaming known empirical patterns or calling fitted quantities predictions. This is the normal case of a proof paper whose central claims rest on external mathematical techniques rather than internal reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Mild assumptions covering continuous and semi-discrete transport problems
Forward citations
Cited by 2 Pith papers
-
Finite-sample bounds for regularized optimal transport
Establishes explicit finite-sample bias and variance bounds for regularized OT costs that improve prior entropic results, deliver the first quantitative bounds for L^p regularization, and yield an n^{-2/(d+4)} rate fo...
-
Stability of Quadratically Regularized Optimal Transport
Establishes L^∞-stability of dual potentials in QOT, yielding local Lipschitz stability of the optimal coupling support in Hausdorff distance for quadratic cost under marginal perturbations.
Reference graph
Works this paper leans on
-
[1]
Bayraktar, S
E. Bayraktar, S. Eckstein, and X. Zhang. Stability and sample complexity of divergence regu- larized optimal transport.Bernoulli, 31(1):213–239, 2025
2025
-
[2]
Blondel, V
M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. volume 84 of Proceedings of Machine Learning Research, pages 880–889, 2018
2018
-
[3]
J. F. Bonnans and A. Shapiro.Perturbation analysis of optimization problems. Springer Series in Operations Research. Springer-Verlag, New York, 2000
2000
-
[4]
Bourgain, H
J. Bourgain, H. Brezis, and P. Mironescu. Another look at Sobolev spaces. InOptimal Control and Partial Differential Equations, pages 439–455. IOS, Amsterdam, 2001
2001
-
[5]
Brezis.Functional analysis, Sobolev spaces and partial differential equations
H. Brezis.Functional analysis, Sobolev spaces and partial differential equations. Universitext. Springer, New York, 2011
2011
-
[6]
G. Carlier. On the linear convergence of the multimarginal Sinkhorn algorithm.SIAM J. Optim., 32(2):786–794, 2022
2022
-
[7]
M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors,Advances in Neural Infor- mation Processing Systems, volume 26. Curran Associates, Inc., 2013
2013
-
[8]
Eckstein and M
S. Eckstein and M. Kupper. Computation of optimal transport and related hedging problems via penalization and neural networks.Appl. Math. Optim., 83(2):639–667, 2021
2021
-
[9]
Eckstein and M
S. Eckstein and M. Nutz. Convergence rates for regularized optimal transport via quantization. Math. Oper. Res., 49(2):1223–1240, 2024
2024
-
[10]
Essid and J
M. Essid and J. Solomon. Quadratically regularized optimal transport on graphs.SIAM J. Sci. Comput., 40(4):A1961–A1986, 2018
2018
-
[11]
G. B. Folland.Real analysis. Pure and Applied Mathematics. John Wiley & Sons, New York, second edition, 1999
1999
-
[12]
Genevay, M
A. Genevay, M. Cuturi, G. Peyré, and F. Bach. Stochastic optimization for large-scale optimal transport. InAdvances in Neural Information Processing Systems 29, pages 3440–3448, 2016
2016
-
[13]
Sample complexity of quadratically regularized optimal transport
A. González-Sanz, E. del Barrio, and M. Nutz. Sample complexity of quadratically regularized optimal transport.arXiv:2511.09807, 2025. 31
-
[14]
A. González-Sanz, S. Eckstein, and M. Nutz. Sparse regularized optimal transport without curse of dimensionality.arXiv:2505.04721, 2025
-
[15]
González-Sanz and M
A. González-Sanz and M. Nutz. Stability of quadratically regularized optimal transport.In preparation, 2026
2026
-
[16]
Linear Convergence of Gradient Descent for Quadratically Regularized Optimal Transport
A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Linear convergence of gradient descent for quadratically regularized optimal transport.arXiv:2509.08547, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[17]
A. González-Sanz and M. Nutz. Sparsity of quadratically regularized optimal transport: Scalar case.SIAM J. Math. Anal., forthcoming, 2024. Preprint arXiv:2410.03353
-
[18]
Grisvard.Elliptic Problems in Nonsmooth Domains, volume 24 ofMonographs and Studies in Mathematics
P. Grisvard.Elliptic Problems in Nonsmooth Domains, volume 24 ofMonographs and Studies in Mathematics. Pitman, Boston, 1985
1985
-
[19]
Gulrajani, F
I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs. InProceedings of the 31st International Conference on Neural Information Processing Systems, pages 5769–5779, 2017
2017
-
[20]
L. Li, A. Genevay, M. Yurochkin, and J. Solomon. Continuous regularized Wasserstein barycen- ters. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors,Advances in Neural Information Processing Systems, volume 33, pages 17755–17765. Curran Associates, Inc., 2020
2020
- [21]
-
[22]
D. A. Lorenz and H. Mahler. Orlicz space regularization of continuous optimal transport prob- lems.Appl. Math. Optim., 85(2):Paper No. 14, 33, 2022
2022
-
[23]
D. A. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport.Appl. Math. Optim., 83(3):1919–1949, 2021
1919
-
[24]
Muzellec, R
B. Muzellec, R. Nock, G. Patrini, and F. Nielsen. Tsallis regularized optimal transport and ecological inference.Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), 2017
2017
-
[25]
M. Nutz. Introduction to entropic optimal transport, 2021. Lecture notes, Columbia University, https://www.math.columbia.edu/~mnutz/docs/EOT_lecture_notes.pdf
2021
-
[26]
M. Nutz. Quadratically regularized optimal transport: existence and multiplicity of potentials. SIAM J. Math. Anal., 57(3):2622–2649, 2025
2025
-
[27]
Peyré and M
G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends®in Machine Learning, 11(5–6):355–607, 2019
2019
-
[28]
Rüschendorf and W
L. Rüschendorf and W. Thomsen. Note on the Schrödinger equation andI-projections.Statist. Probab. Lett., 17(5):369–375, 1993
1993
-
[29]
Schmitzer
B. Schmitzer. Stabilized sparse scaling algorithms for entropy regularized transport problems. SIAM J. Sci. Comput., 41(3):A1443–A1481, 2019
2019
-
[30]
Seguy, B
V. Seguy, B. B. Damodaran, R. Flamary, N. Courty, A. Rolet, and M. Blondel. Large scale optimal transport and mapping estimation. InInternational Conference on Learning Represen- tations, 2018
2018
- [31]
-
[32]
Wiesel and X
J. Wiesel and X. Xu. Sparsity of quadratically regularized optimal transport: Bounds on con- centration and bias.SIAM J. Math. Anal., 57(6):6498–6521, 2025
2025
- [33]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.