Finding accurate eigenvalues and eigenvectors of positive semi-definite matrices given a subspace
Pith reviewed 2026-05-08 16:20 UTC · model grok-4.3
The pith
Nyström's method gives more accurate leading eigenvalues than Rayleigh-Ritz for positive semi-definite matrices given an approximating subspace.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given a k-dimensional subspace Q that approximates the leading eigenspace of an n×n positive semi-definite matrix A, the Nyström method produces approximate eigenvalues that are always more accurate than those from Rayleigh-Ritz; the improvement can be arbitrarily large, and both methods share the same leading computational cost of forming AQ plus an O(nk²) term.
What carries the argument
Nyström's method, which forms the small Gram matrix from the action of A on the subspace basis and extracts its eigenvalues as the approximations to those of A.
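The two extractions can be sketched side by side in NumPy (an illustrative construction, not the paper's code; the test matrix and the subspace-iteration start are assumptions). Both reuse the same product AQ: Rayleigh-Ritz diagonalizes QᵀAQ, while Nyström takes the generalized eigenvalues of the pair (QᵀA²Q, QᵀAQ), i.e. the nonzero eigenvalues of AQ (QᵀAQ)⁻¹ (AQ)ᵀ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 5

# Illustrative PSD matrix with geometrically decaying spectrum.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * 0.5 ** np.arange(n)) @ U.T

# Orthonormal basis of a subspace near the leading eigenspace
# (one step of subspace iteration on a random start).
Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))

AQ = A @ Q                                   # shared dominant cost
rr = np.linalg.eigvalsh(Q.T @ AQ)[::-1]      # Rayleigh-Ritz values

# Nystrom values: generalized eigenvalues of (Q^T A^2 Q, Q^T A Q),
# via Cholesky whitening of the SPD matrix Q^T A Q.
L = np.linalg.cholesky(Q.T @ AQ)
M = np.linalg.solve(L, np.linalg.solve(L, AQ.T @ AQ).T)
ny = np.linalg.eigvalsh(M)[::-1]
```

Both small solves cost O(nk²) beyond forming AQ; by the Cauchy-Schwarz argument the leading Nyström value can only move upward toward the true λ₁ relative to the Rayleigh-Ritz value, while both remain below λ₁.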
If this is right
- Nyström eigenvalues always have higher accuracy than Rayleigh-Ritz eigenvalues for the leading part.
- The accuracy gain can be made arbitrarily large by choosing A with sufficiently fast spectral decay.
- A similar accuracy gain is observed for the leading eigenvectors.
- When the target is instead the trailing eigenvalues, Nyström accuracy drops and Rayleigh-Ritz becomes preferable.
- A simple remedy restores good performance for the trailing case.
Where Pith is reading between the lines
- The result indicates that positivity alone changes the optimality of the standard extraction step.
- Existing codes that already compute A Q can switch to Nyström at negligible extra cost for leading-spectrum problems.
- The reversal for trailing eigenvalues suggests the extraction choice should depend on which part of the spectrum is sought.
Load-bearing premise
The input subspace approximates the leading eigenspace of the positive semi-definite matrix.
What would settle it
Take any positive semi-definite matrix with a rapidly decaying spectrum, form a subspace close to its leading eigenspace, and compute the eigenvalue errors; if any Rayleigh-Ritz error comes out smaller than the corresponding Nyström error, the claimed superiority is false.
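That experiment takes a few lines of NumPy (an illustrative setup, not the paper's: the decay rate, noise level, and subspace construction are assumptions). Build a PSD matrix with fast spectral decay, perturb its leading eigenvectors to get Q, and compare per-eigenvalue errors for the two extractions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 8

# PSD matrix with rapidly decaying spectrum (illustrative construction).
lam = 0.3 ** np.arange(n)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * lam) @ U.T

# Subspace close to the leading eigenspace: leading eigenvectors plus noise.
Q, _ = np.linalg.qr(U[:, :k] + 1e-3 * rng.standard_normal((n, k)))

AQ = A @ Q
rr = np.linalg.eigvalsh(Q.T @ AQ)[::-1]                  # Rayleigh-Ritz
L = np.linalg.cholesky(Q.T @ AQ)
M = np.linalg.solve(L, np.linalg.solve(L, AQ.T @ AQ).T)  # L^{-1} (Q^T A^2 Q) L^{-T}
ny = np.linalg.eigvalsh(M)[::-1]                         # Nystrom

err_rr = np.abs(lam[:k] - rr)
err_ny = np.abs(lam[:k] - ny)
print("RR errors:     ", err_rr)
print("Nystrom errors:", err_ny)
```

If the paper's claim holds, every entry of `err_ny` should be at most the matching entry of `err_rr`; for the leading eigenvalue this is guaranteed by the Cauchy-Schwarz argument, since both estimates lie below λ₁ and Nyström's lies above Rayleigh-Ritz's.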
read the original abstract
We revisit a classical problem in numerical linear algebra: given a $k$-dimensional subspace $\mathcal{Q}$ that approximates the leading eigenspace of an $n\times n$ positive semi-definite matrix $A$, the goal is to extract high-accuracy eigenvalues. The Rayleigh-Ritz (RR) method is the standard algorithm for the task, which has been shown to be optimal in several ways (when $A$ is symmetric, but not necessarily positive semi-definite, $A\succeq 0$). In this paper, we show that when $A \succeq 0$, alternative methods can outperform RR while having the same computational complexity; that is, the main cost is in computing $AQ$, plus an $O(nk^2)$ term. In particular, we advocate the use of Nystr{\"o}m's method, showing that the approximate eigenvalues always have higher accuracy than RR, and the improvement can be arbitrarily large. The difference is significant, especially when $A$ has a fast-decaying spectrum. A similar improvement is numerically observed for the purpose of approximating the leading eigenvectors. In contrast, when the target eigenvalues are the trailing ones, the situation is reversed, and the Nystr{\"o}m method performs poorly; we suggest a remedy for this situation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that, given a k-dimensional subspace Q approximating the leading eigenspace of an n×n PSD matrix A, Nyström's method (using the approximate matrix A Q (Qᵀ A Q)⁻¹ Qᵀ A) yields eigenvalues with strictly higher accuracy than the Rayleigh-Ritz (RR) method for all k eigenvalues, with the improvement arbitrarily large, while retaining the same leading computational cost of forming A Q plus O(n k²) work. It further reports numerical improvement for eigenvectors and proposes a remedy when targeting trailing eigenvalues instead.
Significance. If the central accuracy claim holds with a complete proof, the result would be a practically useful refinement of the standard RR procedure for PSD matrices with rapidly decaying spectra, at no extra asymptotic cost. The paper correctly identifies that RR optimality proofs assume symmetry without the PSD property and supplies a concrete, implementable alternative together with a trailing-eigenvalue fix.
major comments (2)
- [§3.2] §3.2 (variational argument for the largest eigenvalue): the Cauchy-Schwarz argument correctly shows that the largest Nyström eigenvalue is at least as large as the largest RR eigenvalue and therefore has smaller or equal absolute error to the true λ₁. No analogous min-max characterization is supplied for the j-th largest eigenvalue when j>1; the ordering of the remaining generalized eigenvalues of (Qᵀ A² Q, Qᵀ A Q) does not automatically guarantee smaller deviations from the corresponding true eigenvalues. The plural claim that 'the approximate eigenvalues always have higher accuracy' therefore rests on an unproven extension.
- [Theorem 3.1] Theorem 3.1 / Corollary 3.2: the statement that the improvement 'can be arbitrarily large' is illustrated only by a single 2×2 example; a quantitative bound relating the gap between Nyström and RR errors to the decay rate of the spectrum of A or to the angle between Q and the true eigenspace is missing. Without such a bound the 'arbitrarily large' assertion remains an existence claim rather than a practical guarantee.
minor comments (2)
- Notation: the symbol μ is used both for RR eigenvalues and for the true eigenvalues in different paragraphs; a consistent distinction (e.g., μ_RR vs. λ) would improve readability.
- Figure 2 caption: the legend labels 'Nyström' and 'RR' but the y-axis is labeled 'relative error'; it is unclear whether the plotted quantity is |λ̂ - λ| / λ or the absolute error.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable comments on the proof structure and the quantification of improvements. We agree that strengthening the argument for all eigenvalues and providing more analysis on the size of the improvement will improve the manuscript. We respond to each major comment below and will revise accordingly.
read point-by-point responses
-
Referee: [§3.2] §3.2 (variational argument for the largest eigenvalue): the Cauchy-Schwarz argument correctly shows that the largest Nyström eigenvalue is at least as large as the largest RR eigenvalue and therefore has smaller or equal absolute error to the true λ₁. No analogous min-max characterization is supplied for the j-th largest eigenvalue when j>1; the ordering of the remaining generalized eigenvalues of (Qᵀ A² Q, Qᵀ A Q) does not automatically guarantee smaller deviations from the corresponding true eigenvalues. The plural claim that 'the approximate eigenvalues always have higher accuracy' therefore rests on an unproven extension.
Authors: We thank the referee for identifying this gap. The Cauchy-Schwarz argument in §3.2 is presented for the largest eigenvalue. For j > 1 the manuscript invokes the ordering of the generalized eigenvalues of the pair (Qᵀ A² Q, Qᵀ A Q) together with the PSD property of A, but does not supply an explicit min-max proof that each individual Nyström eigenvalue has smaller or equal error. We will add a rigorous extension using the variational characterization of generalized eigenvalues (adapted Ky Fan or Courant-Fischer-type arguments that exploit A ≽ 0) and will update the statements in §3.2 and the abstract to reflect the strengthened proof. revision: yes
-
Referee: [Theorem 3.1] Theorem 3.1 / Corollary 3.2: the statement that the improvement 'can be arbitrarily large' is illustrated only by a single 2×2 example; a quantitative bound relating the gap between Nyström and RR errors to the decay rate of the spectrum of A or to the angle between Q and the true eigenspace is missing. Without such a bound the 'arbitrarily large' assertion remains an existence claim rather than a practical guarantee.
Authors: The 2×2 counter-example demonstrates that the error ratio can be driven to infinity by suitable choice of spectrum decay and subspace angle. We agree that a general quantitative bound would be desirable. In the revision we will derive an explicit relation between the error gap and the principal angles between Q and the leading eigenspace together with the tail sum of the spectrum of A (or the spectral gap after the k-th eigenvalue). If a simple closed-form bound proves elusive under minimal assumptions, we will at least supply additional analytic estimates and numerical illustrations that quantify the improvement under rapid spectral decay. This moves the claim from pure existence toward a practical guarantee. revision: partial
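The 2×2 mechanism in question is easy to reproduce (a sketch with an arbitrarily chosen fixed angle, not the paper's exact example): with A = diag(1, ε) and a one-dimensional subspace at angle θ to the leading eigenvector, the ratio of the Rayleigh-Ritz error to the Nyström error for λ₁ = 1 scales like cos²θ/ε, so shrinking ε drives it without bound.

```python
import numpy as np

theta = 0.3                                    # subspace angle (illustrative)
q = np.array([np.cos(theta), np.sin(theta)])   # unit basis of Q, k = 1
ratios = []
for eps in [1e-2, 1e-4, 1e-6]:
    A = np.diag([1.0, eps])          # PSD with extreme spectral decay
    rr = q @ A @ q                   # Rayleigh-Ritz estimate of lambda_1 = 1
    ny = (q @ A @ A @ q) / rr        # Nystrom estimate: q^T A^2 q / q^T A q
    ratios.append((1 - rr) / (1 - ny))   # RR error over Nystrom error
```

Each hundredfold decrease in ε increases the error ratio by roughly a factor of one hundred, matching the claim that the improvement is unbounded under fast spectral decay.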
Circularity Check
No circularity: Nyström superiority derived from external variational inequalities and standard RR comparison
full rationale
The central claim rests on a direct application of the Cauchy-Schwarz inequality to the generalized Rayleigh quotient for the largest eigenvalue, showing λ_N,1 ≥ μ_RR,1 while both are bounded above by the true λ_1. This reduction uses only the PSD assumption, orthonormality of Q, and the definition of the Nyström matrix; it does not redefine any quantity in terms of the claimed result. Extension to the full set of k eigenvalues is presented via spectral approximation properties of the Nyström operator or numerical evidence, without reducing to a fitted parameter, self-citation chain, or ansatz smuggled from prior work by the same authors. The comparison target (Rayleigh-Ritz) is the externally known standard method whose optimality properties are independent of this paper. No load-bearing step collapses by construction to the input subspace or to a self-referential definition.
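The first-eigenvalue step of that rationale can be written out explicitly (a reconstruction from the quantities above, not the paper's verbatim proof). For any unit vector $q$ in the range of $Q$ with $q^\top A q > 0$:

```latex
% Cauchy-Schwarz applied to u = q, w = Aq:
(q^\top A q)^2 \;=\; \langle q, Aq\rangle^2 \;\le\; \|q\|^2\,\|Aq\|^2 \;=\; q^\top A^2 q,
\qquad\text{so}\qquad
\frac{q^\top A^2 q}{q^\top A q} \;\ge\; q^\top A q .
% Maximizing both sides over unit q in range(Q) gives lambda_{N,1} >= mu_{RR,1}.
% Since A >= 0 implies A^2 <= lambda_1 A (Loewner order), the Nystrom quotient
% is also bounded above by lambda_1; both estimates underestimate lambda_1,
% so the Nystrom error for the largest eigenvalue is no larger than the RR error.
```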
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: A is positive semi-definite
- domain assumption: Q is a k-dimensional subspace approximating the leading eigenspace