Finding accurate eigenvalues and eigenvectors of positive semi-definite matrices given a subspace
Pith reviewed 2026-05-08 16:20 UTC · model grok-4.3
The pith
Nyström's method gives more accurate leading eigenvalues than Rayleigh-Ritz for positive semi-definite matrices given an approximating subspace.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given a k-dimensional subspace Q that approximates the leading eigenspace of an n×n positive semi-definite matrix A, the Nyström method produces approximate eigenvalues that are always more accurate than those from Rayleigh-Ritz; the improvement can be arbitrarily large, and both methods share the same leading computational cost of forming AQ plus an O(nk²) term.
What carries the argument
Nyström's method, which forms the small Gram matrix from the action of A on the subspace basis and extracts its eigenvalues as the approximations to those of A.
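The two extractions can be sketched side by side in NumPy (an illustrative construction, not the paper's code; the test matrix and the subspace-iteration start are assumptions). Both reuse the same product AQ: Rayleigh-Ritz diagonalizes QᵀAQ, while Nyström takes the generalized eigenvalues of the pair (QᵀA²Q, QᵀAQ), i.e. the nonzero eigenvalues of AQ (QᵀAQ)⁻¹ (AQ)ᵀ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 5

# Illustrative PSD matrix with geometrically decaying spectrum.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * 0.5 ** np.arange(n)) @ U.T

# Orthonormal basis of a subspace near the leading eigenspace
# (one step of subspace iteration on a random start).
Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))

AQ = A @ Q                                   # shared dominant cost
rr = np.linalg.eigvalsh(Q.T @ AQ)[::-1]      # Rayleigh-Ritz values

# Nystrom values: generalized eigenvalues of (Q^T A^2 Q, Q^T A Q),
# via Cholesky whitening of the SPD matrix Q^T A Q.
L = np.linalg.cholesky(Q.T @ AQ)
M = np.linalg.solve(L, np.linalg.solve(L, AQ.T @ AQ).T)
ny = np.linalg.eigvalsh(M)[::-1]
```

Both small solves cost O(nk²) beyond forming AQ; by the Cauchy-Schwarz argument the leading Nyström value can only move upward toward the true λ₁ relative to the Rayleigh-Ritz value, while both remain below λ₁.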
If this is right
- Nyström eigenvalues always have higher accuracy than Rayleigh-Ritz eigenvalues for the leading part.
- The accuracy gain can be made arbitrarily large by choosing A with sufficiently fast spectral decay.
- A similar accuracy gain is observed for the leading eigenvectors.
- When the target is instead the trailing eigenvalues, Nyström accuracy drops and Rayleigh-Ritz becomes preferable.
- A simple remedy restores good performance for the trailing case.
Where Pith is reading between the lines
- The result indicates that positivity alone changes the optimality of the standard extraction step.
- Existing codes that already compute A Q can switch to Nyström at negligible extra cost for leading-spectrum problems.
- The reversal for trailing eigenvalues suggests the extraction choice should depend on which part of the spectrum is sought.
Load-bearing premise
The input subspace approximates the leading eigenspace of the positive semi-definite matrix.
What would settle it
Take any positive semi-definite matrix with a rapidly decaying spectrum, form a subspace close to its leading eigenspace, and compute the eigenvalue errors; if any Rayleigh-Ritz error comes out smaller than the corresponding Nyström error, the claimed superiority is false.
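That experiment takes a few lines of NumPy (an illustrative setup, not the paper's: the decay rate, noise level, and subspace construction are assumptions). Build a PSD matrix with fast spectral decay, perturb its leading eigenvectors to get Q, and compare per-eigenvalue errors for the two extractions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 8

# PSD matrix with rapidly decaying spectrum (illustrative construction).
lam = 0.3 ** np.arange(n)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * lam) @ U.T

# Subspace close to the leading eigenspace: leading eigenvectors plus noise.
Q, _ = np.linalg.qr(U[:, :k] + 1e-3 * rng.standard_normal((n, k)))

AQ = A @ Q
rr = np.linalg.eigvalsh(Q.T @ AQ)[::-1]                  # Rayleigh-Ritz
L = np.linalg.cholesky(Q.T @ AQ)
M = np.linalg.solve(L, np.linalg.solve(L, AQ.T @ AQ).T)  # L^{-1} (Q^T A^2 Q) L^{-T}
ny = np.linalg.eigvalsh(M)[::-1]                         # Nystrom

err_rr = np.abs(lam[:k] - rr)
err_ny = np.abs(lam[:k] - ny)
print("RR errors:     ", err_rr)
print("Nystrom errors:", err_ny)
```

If the paper's claim holds, every entry of `err_ny` should be at most the matching entry of `err_rr`; for the leading eigenvalue this is guaranteed by the Cauchy-Schwarz argument, since both estimates lie below λ₁ and Nyström's lies above Rayleigh-Ritz's.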
read the original abstract
We revisit a classical problem in numerical linear algebra: given a $k$-dimensional subspace $\mathcal{Q}$ that approximates the leading eigenspace of an $n\times n$ positive semi-definite matrix $A$, the goal is to extract high-accuracy eigenvalues. The Rayleigh-Ritz (RR) method is the standard algorithm for the task, which has been shown to be optimal in several ways (when $A$ is symmetric, but not necessarily positive semi-definite, $A\succeq 0$). In this paper, we show that when $A \succeq 0$, alternative methods can outperform RR while having the same computational complexity; that is, the main cost is in computing $AQ$, plus an $O(nk^2)$ term. In particular, we advocate the use of Nystr{\"o}m's method, showing that the approximate eigenvalues always have higher accuracy than RR, and the improvement can be arbitrarily large. The difference is significant, especially when $A$ has a fast-decaying spectrum. A similar improvement is numerically observed for the purpose of approximating the leading eigenvectors. In contrast, when the target eigenvalues are the trailing ones, the situation is reversed, and the Nystr{\"o}m method performs poorly; we suggest a remedy for this situation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that, given a k-dimensional subspace Q approximating the leading eigenspace of an n×n PSD matrix A, Nyström's method (using the approximate matrix A Q (Qᵀ A Q)⁻¹ Qᵀ A) yields eigenvalues with strictly higher accuracy than the Rayleigh-Ritz (RR) method for all k eigenvalues, with the improvement arbitrarily large, while retaining the same leading computational cost of forming A Q plus O(n k²) work. It further reports numerical improvement for eigenvectors and proposes a remedy when targeting trailing eigenvalues instead.
Significance. If the central accuracy claim holds with a complete proof, the result would be a practically useful refinement of the standard RR procedure for PSD matrices with rapidly decaying spectra, at no extra asymptotic cost. The paper correctly identifies that RR optimality proofs assume symmetry without the PSD property and supplies a concrete, implementable alternative together with a trailing-eigenvalue fix.
major comments (2)
- [§3.2] §3.2 (variational argument for the largest eigenvalue): the Cauchy-Schwarz argument correctly shows that the largest Nyström eigenvalue is at least as large as the largest RR eigenvalue and therefore has smaller or equal absolute error to the true λ₁. No analogous min-max characterization is supplied for the j-th largest eigenvalue when j>1; the ordering of the remaining generalized eigenvalues of (Qᵀ A² Q, Qᵀ A Q) does not automatically guarantee smaller deviations from the corresponding true eigenvalues. The plural claim that 'the approximate eigenvalues always have higher accuracy' therefore rests on an unproven extension.
- [Theorem 3.1] Theorem 3.1 / Corollary 3.2: the statement that the improvement 'can be arbitrarily large' is illustrated only by a single 2×2 example; a quantitative bound relating the gap between Nyström and RR errors to the decay rate of the spectrum of A or to the angle between Q and the true eigenspace is missing. Without such a bound the 'arbitrarily large' assertion remains an existence claim rather than a practical guarantee.
minor comments (2)
- Notation: the symbol μ is used both for RR eigenvalues and for the true eigenvalues in different paragraphs; a consistent distinction (e.g., μ_RR vs. λ) would improve readability.
- Figure 2 caption: the legend labels 'Nyström' and 'RR' but the y-axis is labeled 'relative error'; it is unclear whether the plotted quantity is |λ̂ - λ| / λ or the absolute error.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable comments on the proof structure and the quantification of improvements. We agree that strengthening the argument for all eigenvalues and providing more analysis on the size of the improvement will improve the manuscript. We respond to each major comment below and will revise accordingly.
read point-by-point responses
-
Referee: [§3.2] §3.2 (variational argument for the largest eigenvalue): the Cauchy-Schwarz argument correctly shows that the largest Nyström eigenvalue is at least as large as the largest RR eigenvalue and therefore has smaller or equal absolute error to the true λ₁. No analogous min-max characterization is supplied for the j-th largest eigenvalue when j>1; the ordering of the remaining generalized eigenvalues of (Qᵀ A² Q, Qᵀ A Q) does not automatically guarantee smaller deviations from the corresponding true eigenvalues. The plural claim that 'the approximate eigenvalues always have higher accuracy' therefore rests on an unproven extension.
Authors: We thank the referee for identifying this gap. The Cauchy-Schwarz argument in §3.2 is presented for the largest eigenvalue. For j > 1 the manuscript invokes the ordering of the generalized eigenvalues of the pair (Qᵀ A² Q, Qᵀ A Q) together with the PSD property of A, but does not supply an explicit min-max proof that each individual Nyström eigenvalue has smaller or equal error. We will add a rigorous extension using the variational characterization of generalized eigenvalues (adapted Ky Fan or Courant-Fischer-type arguments that exploit A ≽ 0) and will update the statements in §3.2 and the abstract to reflect the strengthened proof. revision: yes
-
Referee: [Theorem 3.1] Theorem 3.1 / Corollary 3.2: the statement that the improvement 'can be arbitrarily large' is illustrated only by a single 2×2 example; a quantitative bound relating the gap between Nyström and RR errors to the decay rate of the spectrum of A or to the angle between Q and the true eigenspace is missing. Without such a bound the 'arbitrarily large' assertion remains an existence claim rather than a practical guarantee.
Authors: The 2×2 counter-example demonstrates that the error ratio can be driven to infinity by suitable choice of spectrum decay and subspace angle. We agree that a general quantitative bound would be desirable. In the revision we will derive an explicit relation between the error gap and the principal angles between Q and the leading eigenspace together with the tail sum of the spectrum of A (or the spectral gap after the k-th eigenvalue). If a simple closed-form bound proves elusive under minimal assumptions, we will at least supply additional analytic estimates and numerical illustrations that quantify the improvement under rapid spectral decay. This moves the claim from pure existence toward a practical guarantee. revision: partial
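The 2×2 mechanism in question is easy to reproduce (a sketch with an arbitrarily chosen fixed angle, not the paper's exact example): with A = diag(1, ε) and a one-dimensional subspace at angle θ to the leading eigenvector, the ratio of the Rayleigh-Ritz error to the Nyström error for λ₁ = 1 scales like cos²θ/ε, so shrinking ε drives it without bound.

```python
import numpy as np

theta = 0.3                                    # subspace angle (illustrative)
q = np.array([np.cos(theta), np.sin(theta)])   # unit basis of Q, k = 1
ratios = []
for eps in [1e-2, 1e-4, 1e-6]:
    A = np.diag([1.0, eps])          # PSD with extreme spectral decay
    rr = q @ A @ q                   # Rayleigh-Ritz estimate of lambda_1 = 1
    ny = (q @ A @ A @ q) / rr        # Nystrom estimate: q^T A^2 q / q^T A q
    ratios.append((1 - rr) / (1 - ny))   # RR error over Nystrom error
```

Each hundredfold decrease in ε increases the error ratio by roughly a factor of one hundred, matching the claim that the improvement is unbounded under fast spectral decay.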
Circularity Check
No circularity: Nyström superiority derived from external variational inequalities and standard RR comparison
full rationale
The central claim rests on a direct application of the Cauchy-Schwarz inequality to the generalized Rayleigh quotient for the largest eigenvalue, showing λ_N,1 ≥ μ_RR,1 while both are bounded above by the true λ_1. This reduction uses only the PSD assumption, orthonormality of Q, and the definition of the Nyström matrix; it does not redefine any quantity in terms of the claimed result. Extension to the full set of k eigenvalues is presented via spectral approximation properties of the Nyström operator or numerical evidence, without reducing to a fitted parameter, self-citation chain, or ansatz smuggled from prior work by the same authors. The comparison target (Rayleigh-Ritz) is the externally known standard method whose optimality properties are independent of this paper. No load-bearing step collapses by construction to the input subspace or to a self-referential definition.
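The first-eigenvalue step of that rationale can be written out explicitly (a reconstruction from the quantities above, not the paper's verbatim proof). For any unit vector $q$ in the range of $Q$ with $q^\top A q > 0$:

```latex
% Cauchy-Schwarz applied to u = q, w = Aq:
(q^\top A q)^2 \;=\; \langle q, Aq\rangle^2 \;\le\; \|q\|^2\,\|Aq\|^2 \;=\; q^\top A^2 q,
\qquad\text{so}\qquad
\frac{q^\top A^2 q}{q^\top A q} \;\ge\; q^\top A q .
% Maximizing both sides over unit q in range(Q) gives lambda_{N,1} >= mu_{RR,1}.
% Since A >= 0 implies A^2 <= lambda_1 A (Loewner order), the Nystrom quotient
% is also bounded above by lambda_1; both estimates underestimate lambda_1,
% so the Nystrom error for the largest eigenvalue is no larger than the RR error.
```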
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: A is positive semi-definite
- domain assumption: Q is a k-dimensional subspace approximating the leading eigenspace