Accelerating operator Sinkhorn iteration with overrelaxation

Andr\'e Uschmajew; Tasuku Soma

arxiv: 2410.14104 · v2 · submitted 2024-10-18 · 🧮 math.OC · cs.NA· math.NA

Accelerating operator Sinkhorn iteration with overrelaxation

Tasuku Soma , Andr\'e Uschmajew This is my paper

Pith reviewed 2026-05-23 18:59 UTC · model grok-4.3

classification 🧮 math.OC cs.NAmath.NA

keywords operator scalingSinkhorn iterationsuccessive overrelaxationHilbert metricpositive definite operatorsconvergence analysis

0 comments

The pith

Successive overrelaxation accelerates the operator Sinkhorn iteration for scaling positive definite operators, with local rate optimization via linearization and global convergence via the Hilbert metric.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops accelerated versions of the operator Sinkhorn iteration by incorporating successive overrelaxation. Local convergence is analyzed by linearizing the iteration around its fixed point, which identifies the best relaxation parameter according to Young's theorem on successive overrelaxation. A geodesic form of the overrelaxed iteration is shown to converge globally on the cone of positive definite operators for relaxation parameters in a specific interval, using contraction properties of the Hilbert metric. These results extend analogous acceleration methods that were previously known only for ordinary matrix scaling. Numerical tests confirm faster performance than the standard iteration in selected applications.

Core claim

Applying successive overrelaxation to the operator Sinkhorn map yields accelerated iterations whose asymptotic convergence rate is governed by the spectral radius of the linearized map at the fixed point; the optimal parameter follows from Young's SOR theorem. In addition, a geodesic overrelaxation variant converges globally whenever the relaxation parameter lies between one and two, because the Hilbert metric contracts distances on the positive definite cone under that condition.

What carries the argument

The overrelaxed operator Sinkhorn iteration map, whose local behavior is captured by its Fréchet derivative at the fixed point and whose global behavior is controlled by the Hilbert metric on the positive definite cone.

Load-bearing premise

The linearization of the iteration map at the fixed point correctly predicts the long-term convergence speed, and the geodesic overrelaxation iteration remains contractive in the Hilbert metric for the chosen parameter values.

What would settle it

An explicit computation of the convergence rate for a low-dimensional operator scaling problem where the observed rate after applying the optimal relaxation parameter differs from the rate given by the spectral radius of the linearized map.

Figures

Figures reproduced from arXiv: 2410.14104 by Andr\'e Uschmajew, Tasuku Soma.

**Figure 2.** Figure 2: Experimental results for ill-conditioned operators. ( [PITH_FULL_IMAGE:figures/full_fig_p020_2.png] view at source ↗

read the original abstract

We propose accelerated versions of the operator Sinkhorn iteration for operator scaling using successive overrelaxation. We analyze the local convergence rates of these accelerated methods via linearization, which allows us to determine the asymptotically optimal relaxation parameter based on Young's SOR theorem. Using the Hilbert metric on positive definite cones, we also obtain a global convergence result for a geodesic version of overrelaxation in a specific range of relaxation parameters. These techniques generalize corresponding results obtained for matrix scaling by Thibault et al. (Algorithms, 14(5):143, 2021) and Lehmann et al. (Optim. Lett., 16(8):2209--2220, 2022). Numerical experiments demonstrate that the proposed methods outperform the original operator Sinkhorn iteration in certain applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper extends SOR acceleration and its analysis from matrix Sinkhorn to the operator setting with local linearization rates and a Hilbert-metric global result, but the step invoking Young's theorem on the linearized operator map needs explicit verification that the required spectral conditions hold.

read the letter

The main takeaway is that they lift the overrelaxation approach from the matrix scaling papers to positive definite operators. They linearize the iteration map at the fixed point to get asymptotic rates, pick the relaxation parameter via Young's SOR theorem, and add a global convergence guarantee for a geodesic version using the Hilbert metric on the cone. This is not just a restatement; the operator case requires adapting the tools, and they do that explicitly rather than assuming it carries over unchanged. The experiments are presented as showing practical speedups in some applications, which matches the goal of accelerating a standard method. That part is useful for anyone already running operator Sinkhorn iterations. The soft spot is the local-rate claim. Young's theorem needs the linearized operator to satisfy property A or consistent ordering for the optimal parameter formula to be valid. Matrices make this automatic, but the operator setting on a cone does not, and the abstract gives no sign they checked the spectrum or ordering separately. If that verification is missing or incomplete, the claimed acceleration rate does not follow directly. The global result looks more robust because it rests on the Hilbert metric, which is a standard tool here. This paper is for people working on numerical methods for operator scaling or related cone-constrained problems in optimization. A reader familiar with the matrix versions will find the extension straightforward to follow. It is worth sending to referees because the generalization is real and the analysis uses classical tools in a plausible way, even if one section needs tighter justification.

Referee Report

1 major / 2 minor

Summary. The paper proposes accelerated versions of the operator Sinkhorn iteration for operator scaling problems by incorporating successive overrelaxation (SOR). Local convergence rates of the accelerated iterations are analyzed by linearizing the nonlinear map at the fixed point and invoking Young's SOR theorem to identify the asymptotically optimal relaxation parameter. A global convergence guarantee is derived for a geodesic variant of overrelaxation via the Hilbert metric on the cone of positive definite operators, valid for a specific parameter range. The results generalize corresponding matrix-scaling accelerations from Thibault et al. (2021) and Lehmann et al. (2022). Numerical experiments indicate faster performance than the unaccelerated operator Sinkhorn iteration in selected applications.

Significance. If the central claims hold, the manuscript supplies a practical and theoretically supported acceleration technique for operator scaling, extending finite-matrix SOR results to the infinite-dimensional operator setting while retaining both local asymptotic rates and global convergence via the Hilbert metric. The explicit use of linearization to recover an optimal relaxation parameter and the geodesic formulation are technically natural extensions; the numerical validation provides empirical evidence of utility. Such accelerations could benefit applications involving positive semidefinite operators, such as quantum marginal problems or semidefinite programming.

major comments (1)

[Local convergence analysis] Local convergence section: the derivation of the asymptotically optimal relaxation parameter rests on linearizing the operator Sinkhorn map to obtain a linear operator L and then directly applying Young's SOR theorem. The manuscript does not verify that L satisfies the structural hypotheses of the theorem (consistent ordering, property A, or equivalent spectral conditions) when L acts on the space of positive definite operators; these conditions hold automatically for finite matrices but require separate justification in the operator setting. Without this verification the claimed optimality of the derived ω* does not follow.

minor comments (2)

[Abstract] The abstract states that the methods 'outperform the original operator Sinkhorn iteration in certain applications'; naming the concrete applications (e.g., quantum information, covariance estimation) would improve readability.
[Introduction] Notation for the relaxation parameter ω and the geodesic step should be introduced with a short reminder of the matrix-case definitions before the operator generalization is stated.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback. We address the single major comment below.

read point-by-point responses

Referee: [Local convergence analysis] Local convergence section: the derivation of the asymptotically optimal relaxation parameter rests on linearizing the operator Sinkhorn map to obtain a linear operator L and then directly applying Young's SOR theorem. The manuscript does not verify that L satisfies the structural hypotheses of the theorem (consistent ordering, property A, or equivalent spectral conditions) when L acts on the space of positive definite operators; these conditions hold automatically for finite matrices but require separate justification in the operator setting. Without this verification the claimed optimality of the derived ω* does not follow.

Authors: We agree that the manuscript does not explicitly verify that the linearized operator L satisfies the structural hypotheses (property A, consistent ordering, or equivalent spectral conditions) required by Young's SOR theorem in the infinite-dimensional setting. This verification is indeed needed to rigorously justify the optimality of ω*. In the revised manuscript we will add a dedicated paragraph (or short appendix) establishing that L inherits these properties from the structure of the operator Sinkhorn map: the linearization is a positive linear operator on the tangent space that preserves the block-nonnegativity pattern induced by the scaling, allowing the same combinatorial arguments used in the matrix case to carry over. This will confirm applicability of the theorem without altering the stated convergence rates. revision: yes

Circularity Check

0 steps flagged

No circularity; relies on external theorems and independent prior matrix results

full rationale

The paper linearizes the operator Sinkhorn map to obtain a linear operator and invokes Young's SOR theorem (a classical external result) to select the optimal relaxation parameter; it also uses the Hilbert metric for a global convergence argument in a stated parameter range. These steps generalize independent results from Thibault et al. (2021) and Lehmann et al. (2022) on the matrix case without self-citation chains or load-bearing internal citations. No parameters are fitted to data and renamed as predictions, no quantities are defined in terms of themselves, and no ansatz is smuggled via self-citation. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The abstract invokes standard domain assumptions from optimization and geometry; no new entities are introduced and the relaxation parameter is derived from classical SOR theory rather than fitted ad hoc.

free parameters (1)

relaxation parameter
Chosen via Young's SOR theorem applied to the linearized map; its specific value depends on spectral properties of the iteration.

axioms (1)

domain assumption The Hilbert metric on the positive definite cone supports a global convergence argument for geodesic overrelaxation within a stated parameter interval
Invoked to obtain the global convergence result

pith-pipeline@v0.9.0 · 5660 in / 1209 out tokens · 30801 ms · 2026-05-23T18:59:13.389851+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 2 internal anchors

[1]

Allen-Zhu, A

Z. Allen-Zhu, A. Garg, Y. Li, R. Oliveira, and A. Wigderson. Operator scaling via geodesically convex optimization, invariant theory and polynomial identity testing. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing , pages 172--181. Association for Computing Machinery, New York, 2018

work page 2018
[2]

Ba c \'ak

M. Ba c \'ak. Convex analysis and optimization in H adamard spaces . De Gruyter, Berlin, 2014

work page 2014
[3]

F. Barthe. On a reverse form of the B rascamp- L ieb inequality. Invent. Math. , 134(2):335--361, 1998

work page 1998
[4]

R. Bhatia. Positive definite matrices . Princeton University Press, Princeton, NJ, 2007

work page 2007
[5]

Burer and R

S. Burer and R. D. C. Monteiro. Local minima and convergence in low-rank semidefinite programming. Math. Program. , 103(3, Ser. A):427--444, 2005

work page 2005
[6]

N. Boumal. An introduction to optimization on smooth manifolds . Cambridge University Press, Cambridge, 2023

work page 2023
[7]

Boyd and L

S. Boyd and L. Vandenberghe. Convex optimization . Cambridge University Press, Cambridge, 2004

work page 2004
[8]

Drton, S

M. Drton, S. Kuriki, and P. Hoff. Existence and uniqueness of the K ronecker covariance MLE . Ann. Statist. , 49(5):2721--2754, 2021

work page 2021
[9]

Geodesic Convexity and Regularized Scatter Estimators

L. Duembgen and D. E. Tyler. Geodesic convexity and regularized scatter estimators. arxiv:1607.05455 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[10]

S. P. Eveson and R. D. Nussbaum. An elementary proof of the B irkhoff- H opf theorem. Math. Proc. Cambridge Philos. Soc. , 117(1):31--55, 1995

work page 1995
[11]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices. Linear Algebra Appl. , 114/115:717--735, 1989

work page 1989
[12]

W. C. Franks and A. Moitra. Rigorous guarantees for Tyler ’s M -estimator via quantum expansion. In Proceedings of the 33rd Conference on Learning Theory , pages 1601--1632. PMLR, 2020

work page 2020
[13]

A. Garg, L. Gurvits, R. Oliveira, and A. Wigderson. Algorithmic and optimization aspects of B rascamp- L ieb inequalities, via operator scaling. Geom. Funct. Anal. , 28(1):100--145, 2018

work page 2018
[14]

A. Garg, L. Gurvits, R. Oliveira, and A. Wigderson. Operator scaling: theory and applications. Found. Comput. Math. , 20(2):223--290, 2020

work page 2020
[15]

T. T. Georgiou and M. Pavon. Positive contraction mappings for classical and quantum S chr\"odinger systems. J. Math. Phys. , 56(3):033301, 24, 2015

work page 2015
[16]

L. Gurvits. Classical complexity and quantum entanglement. J. Comput. System Sci. , 69(3):448--484, 2004

work page 2004
[17]

Gunawardena and C

J. Gunawardena and C. Walsh. Iterates of maps which are non-expansive in H ilbert's projective metric. Kybernetika (Prague) , 39(2):193--204, 2003

work page 2003
[18]

Hackbusch

W. Hackbusch. Iterative solution of large sparse systems of equations . Springer, Cham, second edition, 2016

work page 2016
[19]

N. J. Higham. Functions of matrices . Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008

work page 2008
[20]

L. A. Hageman and T. A. Porsching. Aspects of nonlinear block successive overrelaxation. SIAM J. Numer. Anal. , 12:316--335, 1975

work page 1975
[21]

R. B. Holmes and V. I. Paulsen. Optimal frames for erasures. Linear Algebra Appl. , 377:31--51, 2004

work page 2004
[22]

M. Idel. A review of matrix scaling and S inkhorn's normal form for matrices and positive maps. arxiv:1609.06349 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[23]

T. C. Kwok, L. C. Lau, Y. T. Lee, and A. Ramachandran. The paulsen problem, continuous operator scaling, and smoothed analysis. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing , pages 182--189. Association for Computing Machinery, New York, 2018

work page 2018
[24]

Lemmens and R

B. Lemmens and R. Nussbaum. Nonlinear P erron- F robenius theory . Cambridge University Press, Cambridge, 2012

work page 2012
[25]

Lehmann, M.-K

T. Lehmann, M.-K. von Renesse, A. Sambale, and A. Uschmajew. A note on overrelaxation in the S inkhorn algorithm. Optim. Lett. , 16(8):2209--2220, 2022

work page 2022
[26]

J. M. Ortega and W. C. Rheinboldt. Iterative solution of nonlinear equations in several variables . Academic Press, New York-London, 1970

work page 1970
[27]

I. V. Oseledets, M. V. Rakhuba, and A. Uschmajew. Local convergence of alternating low-rank optimization methods with overrelaxation. Numer. Linear Algebra Appl. , 30(3):e2459, 15, 2023

work page 2023
[28]

Peyr\'e, L

G. Peyr\'e, L. Chizat, F.-X. Vialard, and J. Solomon. Quantum entropic regularization of matrix-valued optimal transport. European J. Appl. Math. , 30(6):1079--1102, 2019

work page 2019
[29]

Schechter

S. Schechter. Iteration methods for nonlinear problems. Trans. Amer. Math. Soc. , 104:179--189, 1962

work page 1962
[30]

Sra and R

S. Sra and R. Hosseini. Geometric optimisation on positive definite matrices for elliptically contoured distributions. In Advances in Neural Information Processing Systems 26 , pages 2562--2570. Curran Associates, Inc., 2013

work page 2013
[31]

Sra and R

S. Sra and R. Hosseini. Conic geometric optimization on the manifold of positive definite matrices. SIAM Journal on Optimization , 25(1):713--739, 2015

work page 2015
[32]

Sinkhorn and P

R. Sinkhorn and P. Knopp. Concerning nonnegative matrices and doubly stochastic matrices. Pacific J. Math. , 21:343--348, 1967

work page 1967
[33]

S. Sra, N. K. Vishnoi, and O. Yildiz. On geodesically convex formulations for the Brascamp-Lieb constant. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018) , pages 25:1--25:15. Schloss Dagstuhl -- Leibniz-Zentrum f \"u r Informatik, Dagstuhl, 2018

work page 2018
[34]

Thibault, L

A. Thibault, L. Chizat, C. Dossal, and N. Papadakis. Overrelaxed S inkhorn- K nopp algorithm for regularized optimal transport. Algorithms (Basel) , 14(5):Paper No. 143, 16, 2021

work page 2021
[35]

D. E. Tyler. A distribution-free M -estimator of multivariate scatter. Ann. Statist. , 15(1):234--251, 1987

work page 1987
[36]

Udri s te

C. Udri s te. Convex functions and optimization methods on R iemannian manifolds . Kluwer Academic Publishers Group, Dordrecht, 1994

work page 1994
[37]

Weber and S

M. Weber and S. Sra. Global optimality for E uclidean CCCP under R iemannian convexity. In Proceedings of the 40th International Conference on Machine Learning , pages 36790--36803. PMLR, 2023

work page 2023

[1] [1]

Allen-Zhu, A

Z. Allen-Zhu, A. Garg, Y. Li, R. Oliveira, and A. Wigderson. Operator scaling via geodesically convex optimization, invariant theory and polynomial identity testing. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing , pages 172--181. Association for Computing Machinery, New York, 2018

work page 2018

[2] [2]

Ba c \'ak

M. Ba c \'ak. Convex analysis and optimization in H adamard spaces . De Gruyter, Berlin, 2014

work page 2014

[3] [3]

F. Barthe. On a reverse form of the B rascamp- L ieb inequality. Invent. Math. , 134(2):335--361, 1998

work page 1998

[4] [4]

R. Bhatia. Positive definite matrices . Princeton University Press, Princeton, NJ, 2007

work page 2007

[5] [5]

Burer and R

S. Burer and R. D. C. Monteiro. Local minima and convergence in low-rank semidefinite programming. Math. Program. , 103(3, Ser. A):427--444, 2005

work page 2005

[6] [6]

N. Boumal. An introduction to optimization on smooth manifolds . Cambridge University Press, Cambridge, 2023

work page 2023

[7] [7]

Boyd and L

S. Boyd and L. Vandenberghe. Convex optimization . Cambridge University Press, Cambridge, 2004

work page 2004

[8] [8]

Drton, S

M. Drton, S. Kuriki, and P. Hoff. Existence and uniqueness of the K ronecker covariance MLE . Ann. Statist. , 49(5):2721--2754, 2021

work page 2021

[9] [9]

Geodesic Convexity and Regularized Scatter Estimators

L. Duembgen and D. E. Tyler. Geodesic convexity and regularized scatter estimators. arxiv:1607.05455 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[10] [10]

S. P. Eveson and R. D. Nussbaum. An elementary proof of the B irkhoff- H opf theorem. Math. Proc. Cambridge Philos. Soc. , 117(1):31--55, 1995

work page 1995

[11] [11]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices. Linear Algebra Appl. , 114/115:717--735, 1989

work page 1989

[12] [12]

W. C. Franks and A. Moitra. Rigorous guarantees for Tyler ’s M -estimator via quantum expansion. In Proceedings of the 33rd Conference on Learning Theory , pages 1601--1632. PMLR, 2020

work page 2020

[13] [13]

A. Garg, L. Gurvits, R. Oliveira, and A. Wigderson. Algorithmic and optimization aspects of B rascamp- L ieb inequalities, via operator scaling. Geom. Funct. Anal. , 28(1):100--145, 2018

work page 2018

[14] [14]

A. Garg, L. Gurvits, R. Oliveira, and A. Wigderson. Operator scaling: theory and applications. Found. Comput. Math. , 20(2):223--290, 2020

work page 2020

[15] [15]

T. T. Georgiou and M. Pavon. Positive contraction mappings for classical and quantum S chr\"odinger systems. J. Math. Phys. , 56(3):033301, 24, 2015

work page 2015

[16] [16]

L. Gurvits. Classical complexity and quantum entanglement. J. Comput. System Sci. , 69(3):448--484, 2004

work page 2004

[17] [17]

Gunawardena and C

J. Gunawardena and C. Walsh. Iterates of maps which are non-expansive in H ilbert's projective metric. Kybernetika (Prague) , 39(2):193--204, 2003

work page 2003

[18] [18]

Hackbusch

W. Hackbusch. Iterative solution of large sparse systems of equations . Springer, Cham, second edition, 2016

work page 2016

[19] [19]

N. J. Higham. Functions of matrices . Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008

work page 2008

[20] [20]

L. A. Hageman and T. A. Porsching. Aspects of nonlinear block successive overrelaxation. SIAM J. Numer. Anal. , 12:316--335, 1975

work page 1975

[21] [21]

R. B. Holmes and V. I. Paulsen. Optimal frames for erasures. Linear Algebra Appl. , 377:31--51, 2004

work page 2004

[22] [22]

M. Idel. A review of matrix scaling and S inkhorn's normal form for matrices and positive maps. arxiv:1609.06349 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[23] [23]

T. C. Kwok, L. C. Lau, Y. T. Lee, and A. Ramachandran. The paulsen problem, continuous operator scaling, and smoothed analysis. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing , pages 182--189. Association for Computing Machinery, New York, 2018

work page 2018

[24] [24]

Lemmens and R

B. Lemmens and R. Nussbaum. Nonlinear P erron- F robenius theory . Cambridge University Press, Cambridge, 2012

work page 2012

[25] [25]

Lehmann, M.-K

T. Lehmann, M.-K. von Renesse, A. Sambale, and A. Uschmajew. A note on overrelaxation in the S inkhorn algorithm. Optim. Lett. , 16(8):2209--2220, 2022

work page 2022

[26] [26]

J. M. Ortega and W. C. Rheinboldt. Iterative solution of nonlinear equations in several variables . Academic Press, New York-London, 1970

work page 1970

[27] [27]

I. V. Oseledets, M. V. Rakhuba, and A. Uschmajew. Local convergence of alternating low-rank optimization methods with overrelaxation. Numer. Linear Algebra Appl. , 30(3):e2459, 15, 2023

work page 2023

[28] [28]

Peyr\'e, L

G. Peyr\'e, L. Chizat, F.-X. Vialard, and J. Solomon. Quantum entropic regularization of matrix-valued optimal transport. European J. Appl. Math. , 30(6):1079--1102, 2019

work page 2019

[29] [29]

Schechter

S. Schechter. Iteration methods for nonlinear problems. Trans. Amer. Math. Soc. , 104:179--189, 1962

work page 1962

[30] [30]

Sra and R

S. Sra and R. Hosseini. Geometric optimisation on positive definite matrices for elliptically contoured distributions. In Advances in Neural Information Processing Systems 26 , pages 2562--2570. Curran Associates, Inc., 2013

work page 2013

[31] [31]

Sra and R

S. Sra and R. Hosseini. Conic geometric optimization on the manifold of positive definite matrices. SIAM Journal on Optimization , 25(1):713--739, 2015

work page 2015

[32] [32]

Sinkhorn and P

R. Sinkhorn and P. Knopp. Concerning nonnegative matrices and doubly stochastic matrices. Pacific J. Math. , 21:343--348, 1967

work page 1967

[33] [33]

S. Sra, N. K. Vishnoi, and O. Yildiz. On geodesically convex formulations for the Brascamp-Lieb constant. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018) , pages 25:1--25:15. Schloss Dagstuhl -- Leibniz-Zentrum f \"u r Informatik, Dagstuhl, 2018

work page 2018

[34] [34]

Thibault, L

A. Thibault, L. Chizat, C. Dossal, and N. Papadakis. Overrelaxed S inkhorn- K nopp algorithm for regularized optimal transport. Algorithms (Basel) , 14(5):Paper No. 143, 16, 2021

work page 2021

[35] [35]

D. E. Tyler. A distribution-free M -estimator of multivariate scatter. Ann. Statist. , 15(1):234--251, 1987

work page 1987

[36] [36]

Udri s te

C. Udri s te. Convex functions and optimization methods on R iemannian manifolds . Kluwer Academic Publishers Group, Dordrecht, 1994

work page 1994

[37] [37]

Weber and S

M. Weber and S. Sra. Global optimality for E uclidean CCCP under R iemannian convexity. In Proceedings of the 40th International Conference on Machine Learning , pages 36790--36803. PMLR, 2023

work page 2023