Pith · machine review for the scientific record

arXiv: 2605.02279 · v1 · submitted 2026-05-04 · 🧮 math.DG · cs.LG · cs.NA · math.NA · math.OC

Recognition: 3 theorem links

· Lean Theorem

Foundations of Riemannian Geometry for Riemannian Optimization: A Monograph with Detailed Derivations

Benyamin Ghojogh

Pith reviewed 2026-05-08 18:18 UTC · model grok-4.3

classification 🧮 math.DG · cs.LG · cs.NA · math.NA · math.OC
keywords Riemannian geometry · Riemannian optimization · matrix manifolds · Stiefel manifold · Grassmann manifold · symmetric positive definite manifold · Levi-Civita connection · geodesics

The pith

A monograph supplies explicit coordinate derivations of Riemannian geometry for direct use in optimization on matrix manifolds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This monograph assembles the foundations of Riemannian geometry into a self-contained reference that includes step-by-step derivations of tangent and cotangent spaces, tensor calculus, metric tensors, Levi-Civita connections, curvature, and geodesics. These constructions are carried forward to obtain ready-to-use expressions for the Riemannian gradient, Hessian, exponential map, and retraction. The same derivations are then specialized to the Stiefel, Grassmann, and symmetric positive definite manifolds. A reader who needs to code manifold-based algorithms receives concrete matrix formulas instead of abstract theorems that must be re-derived for each implementation.
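
To make "concrete matrix formulas" tangible, here is a minimal sketch of the SPD case. Assuming the affine-invariant metric ⟨U, V⟩_X = tr(X⁻¹ U X⁻¹ V) (the standard choice on the SPD manifold; the monograph's own convention should be checked against it), the Riemannian gradient is obtained from the Euclidean gradient by grad f(X) = X sym(∇f(X)) X:

```python
import numpy as np

def sym(M):
    # Symmetric part of a square matrix.
    return 0.5 * (M + M.T)

def spd_riemannian_gradient(egrad, X):
    # Riemannian gradient on the SPD manifold under the affine-invariant
    # metric <U, V>_X = tr(X^-1 U X^-1 V): grad f(X) = X sym(egrad) X.
    return X @ sym(egrad) @ X

# Example: f(X) = log det X has Euclidean gradient X^{-1},
# so its Riemannian gradient is X X^{-1} X = X itself.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)          # a sample SPD point
egrad = np.linalg.inv(X)
rgrad = spd_riemannian_gradient(egrad, X)
print(np.allclose(rgrad, X))
```

The log-det example is a useful smoke test precisely because the answer collapses to X itself, so any error in the conversion formula is immediately visible.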

Core claim

The monograph establishes a unified treatment by systematically deriving the tangent and cotangent spaces, tensor calculus, metric tensors, Levi-Civita connections, curvature tensors, and geodesics in coordinates and matrix form. It then obtains explicit formulas for the Riemannian gradient, Hessian, exponential map, and retraction, and supplies the corresponding closed-form expressions on the Stiefel, Grassmann, and SPD manifolds.
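
The exponential map is one of the closed forms in question. On the SPD manifold, under the same affine-invariant assumption, it is Exp_X(V) = X^{1/2} expm(X^{-1/2} V X^{-1/2}) X^{1/2}; a sketch using symmetric eigendecompositions rather than a general matrix exponential (a numerical convenience, not the monograph's derivation):

```python
import numpy as np

def spd_exp(X, V):
    # Exponential map on the SPD manifold under the affine-invariant
    # metric: Exp_X(V) = X^{1/2} expm(X^{-1/2} V X^{-1/2}) X^{1/2}.
    # Symmetric eigendecompositions keep everything in real arithmetic.
    w, Q = np.linalg.eigh(X)
    Xh = Q @ np.diag(np.sqrt(w)) @ Q.T          # X^{1/2}
    Xih = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T   # X^{-1/2}
    s, P = np.linalg.eigh(Xih @ V @ Xih)        # V must be symmetric
    return Xh @ (P @ np.diag(np.exp(s)) @ P.T) @ Xh

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
X = A @ A.T + 3 * np.eye(3)        # an SPD point
B = rng.standard_normal((3, 3))
V = 0.5 * (B + B.T)                # a symmetric tangent direction
Y = spd_exp(X, V)
# The map stays on the manifold: Y is symmetric with positive eigenvalues.
print(np.allclose(Y, Y.T), np.all(np.linalg.eigvalsh(Y) > 0))
```

Two defining properties are easy to confirm on samples: Exp_X(0) = X, and the image remains symmetric positive definite.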

What carries the argument

The coordinate-level derivation of the Levi-Civita connection and curvature on matrix manifolds, which produces the explicit Riemannian gradient and Hessian operators needed for numerical optimization.
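
For the Stiefel manifold St(n, p) = {X : XᵀX = I}, the two ingredients a first-order method needs are a tangent-space projection and a retraction. A minimal sketch (the projection formula assumes the embedded metric common in the literature, and the QR retraction is one of several standard choices; the monograph may prefer others):

```python
import numpy as np

def stiefel_proj(X, Z):
    # Project an ambient matrix Z onto the tangent space of the
    # Stiefel manifold at X (X^T X = I): P_X(Z) = Z - X sym(X^T Z).
    XtZ = X.T @ Z
    return Z - X @ (0.5 * (XtZ + XtZ.T))

def stiefel_qr_retraction(X, V):
    # First-order retraction via the QR decomposition; the Q factor is
    # sign-fixed column by column so that retracting the zero vector
    # returns X itself.
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(2)
X, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # a point on St(6, 3)
xi = stiefel_proj(X, rng.standard_normal((6, 3)))
# Tangency: X^T xi must be skew-symmetric.
print(np.allclose(X.T @ xi + xi.T @ X, np.zeros((3, 3))))
Y = stiefel_qr_retraction(X, 0.1 * xi)
print(np.allclose(Y.T @ Y, np.eye(3)))             # Y stays on the manifold
```

The skew-symmetry check is exactly the tangent-space condition XᵀΞ + ΞᵀX = 0 that the coordinate derivations produce.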

Load-bearing premise

Classical results from Riemannian geometry can be specialized to the Stiefel, Grassmann, and SPD manifolds through explicit, error-free coordinate calculations that cover, without omission, everything numerical optimization needs.
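
The Grassmann specialization can be spot-checked the same way. Assuming the usual orthonormal-representative convention, with horizontal tangent vectors satisfying XᵀH = 0, the exponential map follows from a compact SVD of H (a sketch under those assumptions, not the monograph's own derivation):

```python
import numpy as np

def grassmann_proj(X, Z):
    # Horizontal projection for the Grassmann manifold with an
    # orthonormal representative X: P_X(Z) = (I - X X^T) Z.
    return Z - X @ (X.T @ Z)

def grassmann_exp(X, H):
    # Exponential map via the compact SVD H = U diag(s) V^T:
    # Exp_X(H) = X V cos(diag(s)) V^T + U sin(diag(s)) V^T.
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return X @ Vt.T @ np.diag(np.cos(s)) @ Vt + U @ np.diag(np.sin(s)) @ Vt

rng = np.random.default_rng(4)
X, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # a representative of a 2-plane in R^5
H = grassmann_proj(X, rng.standard_normal((5, 2)))
Y = grassmann_exp(X, 0.3 * H)
print(np.allclose(Y.T @ Y, np.eye(2)))             # the image stays orthonormal
```

Each such closed form admits a one-line verifiable property (here, orthonormality of the image), which is how "error-free" claims in the premise can be audited mechanically.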

What would settle it

Numerical verification that the derived expression for the Riemannian Hessian on the SPD manifold reproduces the known closed-form result from the literature or satisfies the defining properties of the Hessian operator on sample points.
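
That verification is cheap to run. Assuming the affine-invariant metric, f(X) = log det X is linear along geodesics, so its Riemannian Hessian vanishes; finite differences along a geodesic through a sample point can then confirm both the gradient pairing d/dt f(Exp_X(tV))|₀ = tr(X⁻¹V) and the zero Hessian (a sketch of the test, not of the monograph's formulas):

```python
import numpy as np

def spd_exp(X, V):
    # Exponential map on the SPD manifold, affine-invariant metric:
    # Exp_X(V) = X^{1/2} expm(X^{-1/2} V X^{-1/2}) X^{1/2}.
    w, Q = np.linalg.eigh(X)
    Xh = Q @ np.diag(np.sqrt(w)) @ Q.T
    Xih = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T
    s, P = np.linalg.eigh(Xih @ V @ Xih)
    return Xh @ (P @ np.diag(np.exp(s)) @ P.T) @ Xh

def f(X):
    # Test function: f(X) = log det X, whose Riemannian gradient is X
    # and whose Riemannian Hessian is zero under this metric.
    return np.linalg.slogdet(X)[1]

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)        # a sample SPD point
B = rng.standard_normal((4, 4))
V = 0.5 * (B + B.T)                # a symmetric tangent direction
t = 1e-3

# First derivative along the geodesic should equal <grad f, V>_X = tr(X^{-1} V).
g = (f(spd_exp(X, t * V)) - f(spd_exp(X, -t * V))) / (2 * t)
print(np.isclose(g, np.trace(np.linalg.inv(X) @ V), atol=1e-5))

# Second derivative along the geodesic is <Hess f[V], V>_X, which is 0 here.
h = (f(spd_exp(X, t * V)) - 2 * f(X) + f(spd_exp(X, -t * V))) / t**2
print(abs(h) < 1e-6)
```

The same two-sided finite-difference template applies to any candidate Hessian formula: if the derived operator disagrees with the second derivative along geodesics at sample points, the derivation has an error.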

Figures

Figures reproduced from arXiv: 2605.02279 by Benyamin Ghojogh.

Figure 1. Topology: (a) multiple open sets, (b) a finite intersection of some of the open sets is also an open set, and (c) a union of some of the open sets is also an open set.

Figure 2. A Hausdorff topological space, in which the points are distinguishable.

Figure 3. A cup and a doughnut (genus-1 torus) are homeomorphic: the cup is gradually molded into a doughnut, and the doughnut's hole corresponds to the hole in the cup's handle.

Figure 4. Examples of S^1, S^2, B^2, and B^3: locally one-dimensional (embedded in 2D), two-dimensional (embedded in 3D), two-dimensional (embedded in 2D), and three-dimensional (embedded in 3D), respectively.

Figure 7. Two charts (U, φ) and (V, ψ) are smoothly compatible if the mapping ψ ∘ φ⁻¹ is a diffeomorphism.

Figure 5. A chart (U, φ) on the manifold M: φ(U) approximates the open set U by a locally flat Euclidean space R^n, i.e., φ(U) and R^n are homeomorphic.

Figure 8. A locally two-dimensional tangent space TpM at a point p of a locally two-dimensional manifold M.

Figure 9. Intrinsic and extrinsic curvature: (a) a manifold flat both intrinsically and extrinsically, (b) a manifold intrinsically flat (homeomorphic to a flat manifold) but extrinsically curved, and (c) a manifold curved both intrinsically and extrinsically. Intrinsic curvature is the curvature felt by a small bug on the manifold.

Figure 11. Two occasions where a curvilinear coordinate system may appear: (a) in an intrinsically flat manifold, and (b) in an intrinsically curved manifold.

Figure 10. Coordinate systems: (a) Cartesian, (b) scaled Cartesian, (c) affine, and (d) curvilinear.

Figure 13. A covector (cotangent vector) as a linear map that takes a vector and returns a real number.

Figure 14. A vector field: (a) vectors at different points of the manifold, which may change from point to point, and (b) the tangent spaces at different points, with each tangent vector lying in its own tangent space.

Figure 15. A covector ω_i visualized as scalar values at the intersections of parallel planes along x^i.

Figure 16. Geometric interpretation of the contravariant components {V^1, V^2} and covariant components {V_1, V_2} of a vector V in a two-dimensional coordinate system (x^1, x^2).

Figure 17. The generalized Pythagorean theorem in (a) a Cartesian, (b) a scaled Cartesian, and (c) a [scaled] affine coordinate system.

Figure 18. The basis vectors {e_j} do not change across the coordinates of (a) Cartesian and (b) affine coordinate systems, but they do change across the coordinates of (c) curvilinear coordinate systems.

Figure 20. Going from point p1 to point p2 along two different infinitesimal paths: (a) in a flat space or a flat region of a manifold both paths end at the same point p2, but (b) in a curved region they end at different points, an effect of curvature.

Figure 19. Going from point p1 to point p2 along two different infinitesimal paths: (1) first in the direction of dx^i and then dx^j, or (2) first in the direction of dx^j and then dx^i.

Figure 21. Ricci flow on an example manifold: the manifold gradually smooths into a sphere and then disappears; the flow's direction at each point is opposite to the sign of the curvature, and its magnitude is proportional to the amount of curvature.

Figure 22. A vector field changing along a path on the manifold (the path in red, the vectors of the field in black).

Figure 23. Holonomy: a vector parallel transported along a closed loop on a curved manifold does not return as it started, because the curvature warps the vector as it moves.

Figure 25. Two points connected by multiple geodesics of the same length.

Figure 26. (a) Euclidean optimization versus (b) Riemannian optimization: the contours of the cost function in red, the coordinates in blue, and the cost axis in green.

Figure 27. The optimization path on the manifold toward a local minimum of the cost function (contours in red).

Figure 28. Retraction map versus exponential map: the exponential map is the dashed curve on the manifold, while the retraction moves along the tangent vector and then projects back onto the manifold, approximating the exponential map.

Figure 29. Vector transport: the retraction Ret_p(η) is first applied to η ∈ TpM, giving a point with its own tangent space; the transport T_η(ξ) then carries ξ from TpM to that tangent space without changing its relative direction.
read the original abstract

Riemannian geometry provides the fundamental framework for optimization on nonlinear spaces such as matrix manifolds, which arise in machine learning, signal processing, and robotics. While the underlying theory is classical, existing literature often presents results at a high level of abstraction, omitting the detailed coordinate-level derivations required for implementation and algorithm development. This work provides a self-contained and rigorous treatment of the foundations of Riemannian geometry, with a focus on explicit derivations tailored to Riemannian optimization. We systematically develop the key geometric structures -- including tangent and cotangent spaces, tensor calculus, metric tensors, Levi-Civita connections, curvature, and geodesics -- emphasizing step-by-step derivations in coordinates and matrix form. Building on these foundations, we derive the Riemannian gradient, Hessian, exponential map, and retraction in a form suitable for numerical computation. We further specialize these constructions to important matrix manifolds, including the Stiefel, Grassmann, and SPD (Symmetric Positive Definite) manifolds, providing explicit formulas widely used in optimization and geometric machine learning. This monograph develops a unified and implementation-oriented treatment of Riemannian geometry for optimization on manifolds. Its main contribution is the systematic organization and detailed derivation of classical geometric constructions in forms directly usable for algorithm design and numerical implementation. By connecting coordinate-level differential geometry with matrix-manifold formulas, the monograph bridges the gap between abstract theory and practical computation, and provides a reference for researchers and practitioners working in Riemannian optimization and related fields.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a monograph developing the foundations of Riemannian geometry with explicit, coordinate-level and matrix-form derivations for use in Riemannian optimization. It systematically treats tangent/cotangent spaces, tensor calculus, metric tensors, Levi-Civita connections, curvature, geodesics, Riemannian gradients/Hessians, exponential maps, and retractions, then specializes these constructions to the Stiefel, Grassmann, and SPD manifolds with formulas intended for direct numerical implementation.

Significance. If the derivations prove accurate and complete, the work supplies a useful, self-contained reference that organizes classical Riemannian geometry results into implementation-oriented forms. This addresses a genuine gap in the literature where abstract treatments often omit the coordinate details needed for algorithm design on matrix manifolds. The emphasis on step-by-step derivations and specialization to commonly used manifolds in geometric machine learning constitutes a practical contribution, even though the underlying mathematics is classical.

minor comments (2)
  1. [Abstract and Introduction] The abstract and introduction describe the derivations as 'rigorous' and 'self-contained' but provide no sample equations or proof outlines; ensure the main body contains fully expanded coordinate derivations for at least the Levi-Civita connection and exponential map on each manifold so readers can verify correctness without external references.
  2. [Sections on metric tensors and curvature] Notation for the metric tensor and its inverse is introduced early but used inconsistently in later matrix-manifold sections; adopt a single, uniform convention (e.g., always denoting the inverse metric explicitly) to avoid ambiguity in the curvature and Hessian formulas.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our monograph and the recommendation of minor revision. The referee's summary correctly identifies the manuscript's focus on explicit coordinate-level and matrix-form derivations of Riemannian geometric structures for direct use in optimization algorithms.

Circularity Check

0 steps flagged

No significant circularity; classical derivations are self-contained

full rationale

The monograph systematically re-derives standard objects of Riemannian geometry (tangent spaces, Levi-Civita connection, curvature, exponential map, retraction) in coordinate and matrix form for the Stiefel, Grassmann, and SPD manifolds. These steps follow directly from the classical definitions of Riemannian metrics, covariant derivatives, and geodesics without introducing fitted parameters, self-referential equations, or load-bearing self-citations that reduce the central claims to their own inputs. The paper explicitly positions its contribution as explicit, implementation-oriented re-derivations of well-established results rather than novel theorems, rendering the derivation chain independent of any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests entirely on standard axioms of smooth manifolds and Riemannian metrics; no free parameters, ad-hoc axioms, or invented entities are introduced or needed for the claimed contribution.

axioms (2)
  • standard math Existence and uniqueness of the Levi-Civita connection on a Riemannian manifold
    Invoked when deriving the Riemannian connection and geodesics for optimization.
  • domain assumption Smooth manifold structure on the Stiefel, Grassmann, and SPD matrix spaces
    Required to apply the general theory to the concrete manifolds listed in the abstract.

pith-pipeline@v0.9.0 · 5569 in / 1317 out tokens · 56689 ms · 2026-05-08T18:18:00.185297+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

16 extracted references · 14 canonical work pages

  1. [1] Asanjarani, Azam. A Finsler geometrical programming for the nonlinear complementarity problem of traffic equilibrium. arXiv preprint arXiv:2109.01256.

  2. [2] Betancourt, Michael, Jordan, Michael I, and Wilson, Ashia C. On symplectic optimization. arXiv preprint arXiv:1802.03653.

  3. [3] Cotton, Émile. Sur les variétés à trois dimensions. Annales de la Faculté des sciences de l'Université de Toulouse pour les sciences mathématiques et les sciences physiques, 1(4):385–438.

  4. [4] França, Guilherme, Barp, Alessandro, Girolami, Mark, and Jordan, Michael I. Optimization on manifolds: A symplectic approach. arXiv preprint arXiv:2107.11231.

  5. [5] Ghojogh, Benyamin, Karray, Fakhri, and Crowley, Mark. Eigenvalue and generalized eigenvalue problems: Tutorial. arXiv preprint arXiv:1903.11240.

  6. [6] Ghojogh, Benyamin, Ghodsi, Ali, Karray, Fakhri, and Crowley, Mark. KKT conditions, first-order and second-order optimization, and distributed optimization: Tutorial and survey. arXiv preprint arXiv:2110.01858.

  7. [7] Ghojogh, Benyamin, Ghodsi, Ali, Karray, Fakhri, and Crowley, Mark. Spectral, probabilistic, and deep metric learning: Tutorial and survey. arXiv preprint arXiv:2201.09267, 2022a.

  8. [8] Hosseini, Reshad and Mash'al, Mohamadreza. MixEst: An estimation toolbox for mixture models. arXiv preprint arXiv:1507.06065.

  9. [9] Huang, Wen, Absil, P-A, and Gallivan, Kyle A. A Riemannian BFGS method for nonconvex optimization problems. In Numerical Mathematics and Advanced Applications ENUMATH 2015, pp. 627–634. Springer.

  10. [10] Kasai, Hiroyuki, Sato, Hiroyuki, and Mishra, Bamdev. Riemannian stochastic variance reduced gradient on Grassmann manifold. arXiv preprint arXiv:1605.07367.

  11. [11] Kochurov, Max, Karimov, Rasul, and Kozlukov, Serge. Geoopt: Riemannian optimization in PyTorch. arXiv preprint arXiv:2005.02819.

  12. [12] Li, Dong-Hui and Fukushima, Masao. On the global convergence of the BFGS method for nonconvex unconstrained optimization problems. SIAM Journal on Optimization, 11(4):1054–1064.

  13. [13] Perelman, Grisha. The entropy formula for the Ricci flow and its geometric applications. arXiv preprint math/0211159.

  14. [14] Perelman, Grisha. Finite extinction time for the solutions to the Ricci flow on certain three-manifolds. arXiv preprint math/0307245, 2003a.

  15. [15] Townsend, James, Koep, Niklas, and Weichwald, Sebastian. Pymanopt: A Python toolbox for optimization on manifolds using automatic differentiation. arXiv preprint arXiv:1603.03236.

  16. [16] Zhang, Hongyi, J Reddi, Sashank, and Sra, Suvrit. Riemannian SVRG: Fast stochastic optimization on Riemannian manifolds. Advances in Neural Information Processing Systems, 29, 2016.