Sharp analysis of sketched least squares and randomized low-rank approximation

Ethan N. Epperly; Robert J. Webber

arxiv: 2605.19096 · v1 · pith:YJ553JGCnew · submitted 2026-05-18 · 🧮 math.NA · cs.NA· stat.CO

Sharp analysis of sketched least squares and randomized low-rank approximation

Ethan N. Epperly , Robert J. Webber This is my paper

Pith reviewed 2026-05-20 07:22 UTC · model grok-4.3

classification 🧮 math.NA cs.NAstat.CO

keywords sketched least squaresrandomized SVDrandom embeddingsminimax optimalityerror boundslow-rank approximationmatrix sketchingnumerical linear algebra

0 comments

The pith

A random orthonormal matrix is minimax optimal for sketched least squares while rotation-invariant embeddings are optimal for randomized SVD.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper determines the best random embeddings for two common randomized algorithms that compress matrices to reduce computation. For least-squares regression via sketch-and-solve, it shows that a random orthonormal matrix achieves the smallest possible worst-case error. For low-rank approximation via randomized SVD, any rotation-invariant embedding works equally well at optimality. These optimality results deliver the tightest achievable error bounds for both methods. Experiments indicate that many standard embeddings perform nearly as well as the optimal ones in practice.

Core claim

The paper proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm while any rotation-invariant embedding is minimax optimal for the randomized SVD, and obtains the best possible error bounds for sketched least-squares and the randomized SVD.

What carries the argument

Minimax optimality analysis of random embeddings applied to matrix sketching for least-squares and low-rank approximation.

If this is right

The derived bounds are the smallest possible for any choice of random embedding in these two algorithms.
Practical implementations can safely use orthonormal matrices for sketched regression without sacrificing worst-case guarantees.
Rotation-invariant embeddings such as Gaussian or Rademacher matrices all achieve the same optimal performance for randomized SVD.
Universality observed in experiments suggests that common embeddings will match optimal accuracy on most inputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The optimality results could extend to other sketching tasks such as kernel approximation or streaming regression.
Knowing the minimax embeddings may simplify theoretical comparisons across different randomized numerical linear algebra methods.
The universality phenomenon raises the question of whether similar near-optimality holds for embeddings outside the rotation-invariant class in finite dimensions.

Load-bearing premise

The error bounds and optimality claims rest on standard random matrix concentration inequalities without extra conditions on matrix coherence.

What would settle it

A concrete matrix and embedding pair where a non-orthonormal sketch produces strictly smaller worst-case least-squares error than an orthonormal one, or where a non-rotation-invariant embedding beats all rotation-invariant ones for SVD approximation.

Figures

Figures reproduced from arXiv: 2605.19096 by Ethan N. Epperly, Robert J. Webber.

**Figure 1.** Figure 1: Accuracy parameter ε = E ° °B − A(Ω∗ A) +(Ω∗B) ° ° 2 F / ° °B − A A+B ° ° 2 F−1 as a function of embedding dimension ℓ for eight random embeddings. All embeddings follow the theoretical predictions of Theorem 2.1 for either the Gaussian matrix or random orthonormal matrix. Theorem 2.2 (Sketch-and-solve: Sharp lower bound). Let Ω ∈ Kn×ℓ be any full-rank random embedding. For any problem dimensions d,p ∈ N… view at source ↗

**Figure 2.** Figure 2: Mean-square error E∥A − Ab∥ 2 F as a function of embedding dimension ℓ for six random embeddings. The error of the optimal rank-ℓ approximation is shown for reference. For the “step” problems, all embeddings follow the new bound (Theorem 3.2), except for sign-based embeddings at small values of ℓ. For the “poly” problems, the bound predicts the mean-square error up to a factor of 2.5×. 3.2 RANDOMIZED SVD:… view at source ↗

read the original abstract

Two widely used randomized algorithms are the sketch-and-solve method for least-squares regression and the randomized SVD for low-rank approximation. These algorithms apply a random embedding to compress a target matrix, and they perform computations on the compressed matrix to save computational cost. This paper asks, what is the optimal random embedding in these algorithms? Also, what is the sharpest possible error bound for the optimal embedding? The paper proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm while any rotation-invariant embedding is minimax optimal for the randomized SVD. Following these results, the paper obtains the best possible error bounds for sketched least-squares and the randomized SVD. Last, empirical experiments provide evidence of universality phenomena, in which several random embeddings lead to similar accuracy to the optimal embeddings in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper proves minimax optimality for random orthonormal embeddings in sketched least squares and rotation-invariant ones in randomized SVD, with matching sharp error bounds that hold under standard assumptions.

read the letter

The main thing to know is that this paper pins down the optimal random embeddings for sketch-and-solve least squares and randomized SVD, then derives the tightest error bounds possible under the usual model. They show a random orthonormal matrix is minimax optimal for the least-squares case, while any rotation-invariant embedding works for the SVD, and they match upper and lower bounds explicitly rather than stopping at one-sided estimates. The upper bounds come from Johnson-Lindenstrauss-style concentration on sub-Gaussian rows, and the lower bounds use explicit constructions that achieve the same constants. The sketch-size requirements are the standard ones: m on the order of n plus log terms for least squares and k plus log terms for SVD, with no extra coherence conditions needed. That setup lets them claim the best possible constants, which improves on earlier analyses that left some slack in the bounds. The work stays grounded in the existing randomized NLA framework and delivers reproducible-style sharpness without fitting parameters or relying on self-citations for the core claims. The universality experiments are only supporting evidence and not part of the formal results, so they add little if your interest is purely theoretical. The paper does not compare runtime costs of the optimal embeddings against simpler ones like Gaussians, but that is a minor gap given the focus on error analysis. This is for readers who need precise constants when applying these algorithms in large-scale regression or low-rank approximation, such as in scientific computing or data science. The math is clear and the claims line up with the stated regimes, so it deserves a serious referee. I would send it out for peer review.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm applied to least-squares regression, while any rotation-invariant embedding is minimax optimal for the randomized SVD. It derives the sharpest possible error bounds for both settings by matching explicit lower-bound constructions to upper bounds obtained from Johnson-Lindenstrauss-type concentration, under the regimes m = Ω(n + log(1/δ)) for least squares and m = Ω(k + log(1/δ)) for SVD, with no additional coherence assumptions beyond standard sub-Gaussian row norms. Empirical experiments are included as supporting evidence for universality phenomena across embeddings.

Significance. If the results hold, the paper makes a valuable contribution to randomized numerical linear algebra by supplying the first sharp, minimax-optimal characterizations of embeddings together with matching upper and lower bounds. The explicit lower-bound constructions for the sketch-and-solve case and the invariance reduction for randomized SVD are strengths that deliver parameter-free, falsifiable guarantees under standard assumptions.

minor comments (2)

[Introduction] The introduction would benefit from a single consolidated statement of the precise sketch-size regimes (m = Ω(n + log(1/δ)) and m = Ω(k + log(1/δ))) required for the sharp constants to hold with high probability.
[Experiments] In the experimental section, a short remark clarifying that the universality plots are illustrative rather than part of the formal claims would prevent any misreading of their role.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. We are pleased that the contributions on minimax optimality for random orthonormal matrices in sketched least squares and rotation-invariant embeddings in randomized SVD are recognized, along with the matching sharp bounds under the stated regimes.

Circularity Check

0 steps flagged

No significant circularity; derivation uses independent minimax lower bounds and concentration inequalities

full rationale

The paper establishes minimax optimality for random orthonormal embeddings in sketch-and-solve and rotation-invariant embeddings in randomized SVD by deriving matching upper bounds from standard sub-Gaussian concentration (Johnson-Lindenstrauss type) and explicit lower-bound constructions based on invariance under orthogonal transformations. These steps are self-contained within the stated assumptions on sketch dimension m = Ω(n + log(1/δ)) or m = Ω(k + log(1/δ)) and do not reduce to fitted parameters, self-definitions, or load-bearing self-citations. The universality experiments are explicitly separated as supporting evidence rather than part of the formal claims. The central results therefore rest on external probabilistic tools and direct constructions rather than circular reduction to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work rests on standard assumptions from random matrix theory and minimax analysis in numerical linear algebra; no free parameters, invented entities, or ad-hoc axioms are indicated in the abstract.

pith-pipeline@v0.9.0 · 5665 in / 1096 out tokens · 34860 ms · 2026-05-20T07:22:13.906863+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 2 internal anchors

[1]

The spectral norm error of the naive Nystrom extension

Gittens, Alex , year =. The Spectral Norm Error of the Na. arXiv:1110.5305 [cs, math] , eprint =

work page internal anchor Pith review Pith/arXiv arXiv
[2]

Practical sketching algorithms for low-rank matrix approximation

Practical Sketching Algorithms for Low-Rank Matrix Approximation , author =. 2017 , month = jan, journal =. doi:10.1137/17M1111590 , urldate =. arXiv , keywords =:1609.00048 , pages =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1137/17m1111590 2017
[3]

In: Proceed- ings of the Forty-First Annual ACM Symposium on Theory of Computing

Clarkson, Kenneth L. and Woodruff, David P. , year =. Numerical Linear Algebra in the Streaming Model , booktitle =. doi:10.1145/1536414.1536445 , urldate =

work page doi:10.1145/1536414.1536445
[4]

Nakatsukasa , Fast and stable randomized low-rank matrix approximation , arXiv:2009.11392 [math.NA], (2020)

Fast and Stable Randomized Low-Rank Matrix Approximation , author =. 2020 , month = sep, number =. arXiv , keywords =:2009.11392 , primaryclass =

work page arXiv 2020
[5]

Murray, J

Murray, Riley and Demmel, James and Mahoney, Michael W. and Erichson, N. Benjamin and Melnichenko, Maksim and Malik, Osman Asif and Grigori, Laura and Luszczek, Piotr and Derezi. Randomized. 2023 , month = apr, number =. doi:10.48550/arXiv.2302.11474 , urldate =. arXiv , keywords =:2302.11474 , primaryclass =

work page doi:10.48550/arxiv.2302.11474 2023
[6]

Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for

Dong, Yijun and Martinsson, Per-Gunnar , year =. Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for. Advances in Computational Mathematics , volume =. doi:10.1007/s10444-023-10061-z , urldate =

work page doi:10.1007/s10444-023-10061-z
[7]

2008 , month = nov, journal =

A Fast Randomized Algorithm for the Approximation of Matrices , author =. 2008 , month = nov, journal =. doi:10.1016/j.acha.2007.12.002 , urldate =

work page doi:10.1016/j.acha.2007.12.002 2008
[8]

2006 , series =

The. 2006 , series =

work page 2006
[9]

and Muthukrishnan, S

Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. , booktitle =. Sampling Algorithms for ^2 Regression and Applications , year =

work page
[10]

Improved

Sarl. Improved. 2006 47th. 2006 , doi =

work page 2006
[11]

Distributed sketching methods for privacy preserving regression,

Bartan, Burak and Pilanci, Mert , month = jun, title =. arXiv preprint arXiv:2002.06538 , year =

work page arXiv 2002
[12]

, journal =

Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Randomized Numerical Linear Algebra:. 2020 , doi =

work page 2020
[13]

Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

Chen, Tyler and Duman Keles, Feyza and Halikias, Diana and Musco, Cameron and Musco, Christopher and Persson, David , year = 2025, pages =. Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

work page 2025
[14]

Sridhar, Srivatsan and Pilanci, Mert and. Lower. 2020 , month = nov, journal =. doi:10.1109/JSAIT.2020.3039509 , urldate =

work page doi:10.1109/jsait.2020.3039509 2020
[15]

2021 , month = jun, journal =

Statistical Properties of Sketching Algorithms , author =. 2021 , month = jun, journal =. doi:10.1093/biomet/asaa062 , urldate =

work page doi:10.1093/biomet/asaa062 2021
[16]

and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =

Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =. Streaming. 2019 , doi =

work page 2019
[17]

Algorithmic

Derezi. Algorithmic. The. 2023 , url =

work page 2023
[18]

Cama. Faster. arXiv preprint arXiv:2508.21189 , publisher =. 2025 , doi =

work page arXiv 2025
[19]

Asymptotics for

Dobriban, Edgar and Liu, Sifan , booktitle =. Asymptotics for

work page
[20]

Chenakkod, Shabarish and Derezi. Optimal. Proceedings of the 56th. 2024 , doi =

work page 2024
[21]

, month = feb, title =

Kireeva, Anastasia and Tropp, Joel A. , month = feb, title =. 2024 , doi =

work page 2024
[22]

, school =

Epperly, Ethan N. , school =. Make the Most of What You Have:. 2025 , doi =

work page 2025
[23]

and Muthukrishnan, S

Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. and Sarl. Faster Least Squares Approximation , volume =. Numerische Mathematik , month = feb, number =. 2011 , doi =

work page 2011
[24]

Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =

Bartan, Burak and Pilanci, Mert , journal =. Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =. 2023 , doi =

work page 2023
[25]

Lacotte, Jonathan and Liu, Sifan and Dobriban, Edgar and Pilanci, Mert , booktitle =. Optimal

work page
[26]

, journal =

Halko, Nathan and Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Finding Structure with Randomness:. 2011 , doi =

work page 2011
[27]

and Webber, Robert J

Tropp, Joel A. and Webber, Robert J. , journal =. Randomized Algorithms for Low-Rank Matrix Approximation: Design, Analysis, and Applictions , year =

work page
[28]

Randomized Algorithms for Low-Rank Matrix Factorizations:

Witten, Rafi and Candès, Emmanuel , journal =. Randomized Algorithms for Low-Rank Matrix Factorizations:. 2014 , doi =

work page 2014
[29]

The Inverted Complex

Shaman, Paul , journal =. The Inverted Complex. 1980 , doi =

work page 1980
[30]

, publisher =

Muirhead, Robb J. , publisher =. Aspects of. 1982 , isbn =

work page 1982
[31]

Mitra, Sujit Kumar , journal =. A. 1970 , issn =

work page 1970
[32]

Positive Definite Matrices , year =

Bhatia, Rajendra , publisher =. Positive Definite Matrices , year =

work page
[33]

Williams, Christopher K. I. and Seeger, Matthias , booktitle =. Using the

work page
[34]

, journal =

Drineas, Petros and Mahoney, Michael W. , journal =. On the. 2005 , issn =

work page 2005
[35]

Revisiting the

Gittens, Alex and Mahoney, Michael , booktitle =. Revisiting the. 2013 , issn =

work page 2013
[36]

and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =

Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =. Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data , volume =

work page
[37]

Ando, Tsuyoshi , booktitle =. Schur. 2005 , doi =

work page 2005

[1] [1]

The spectral norm error of the naive Nystrom extension

Gittens, Alex , year =. The Spectral Norm Error of the Na. arXiv:1110.5305 [cs, math] , eprint =

work page internal anchor Pith review Pith/arXiv arXiv

[2] [2]

Practical sketching algorithms for low-rank matrix approximation

Practical Sketching Algorithms for Low-Rank Matrix Approximation , author =. 2017 , month = jan, journal =. doi:10.1137/17M1111590 , urldate =. arXiv , keywords =:1609.00048 , pages =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1137/17m1111590 2017

[3] [3]

In: Proceed- ings of the Forty-First Annual ACM Symposium on Theory of Computing

Clarkson, Kenneth L. and Woodruff, David P. , year =. Numerical Linear Algebra in the Streaming Model , booktitle =. doi:10.1145/1536414.1536445 , urldate =

work page doi:10.1145/1536414.1536445

[4] [4]

Nakatsukasa , Fast and stable randomized low-rank matrix approximation , arXiv:2009.11392 [math.NA], (2020)

Fast and Stable Randomized Low-Rank Matrix Approximation , author =. 2020 , month = sep, number =. arXiv , keywords =:2009.11392 , primaryclass =

work page arXiv 2020

[5] [5]

Murray, J

Murray, Riley and Demmel, James and Mahoney, Michael W. and Erichson, N. Benjamin and Melnichenko, Maksim and Malik, Osman Asif and Grigori, Laura and Luszczek, Piotr and Derezi. Randomized. 2023 , month = apr, number =. doi:10.48550/arXiv.2302.11474 , urldate =. arXiv , keywords =:2302.11474 , primaryclass =

work page doi:10.48550/arxiv.2302.11474 2023

[6] [6]

Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for

Dong, Yijun and Martinsson, Per-Gunnar , year =. Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for. Advances in Computational Mathematics , volume =. doi:10.1007/s10444-023-10061-z , urldate =

work page doi:10.1007/s10444-023-10061-z

[7] [7]

2008 , month = nov, journal =

A Fast Randomized Algorithm for the Approximation of Matrices , author =. 2008 , month = nov, journal =. doi:10.1016/j.acha.2007.12.002 , urldate =

work page doi:10.1016/j.acha.2007.12.002 2008

[8] [8]

2006 , series =

The. 2006 , series =

work page 2006

[9] [9]

and Muthukrishnan, S

Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. , booktitle =. Sampling Algorithms for ^2 Regression and Applications , year =

work page

[10] [10]

Improved

Sarl. Improved. 2006 47th. 2006 , doi =

work page 2006

[11] [11]

Distributed sketching methods for privacy preserving regression,

Bartan, Burak and Pilanci, Mert , month = jun, title =. arXiv preprint arXiv:2002.06538 , year =

work page arXiv 2002

[12] [12]

, journal =

Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Randomized Numerical Linear Algebra:. 2020 , doi =

work page 2020

[13] [13]

Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

Chen, Tyler and Duman Keles, Feyza and Halikias, Diana and Musco, Cameron and Musco, Christopher and Persson, David , year = 2025, pages =. Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

work page 2025

[14] [14]

Sridhar, Srivatsan and Pilanci, Mert and. Lower. 2020 , month = nov, journal =. doi:10.1109/JSAIT.2020.3039509 , urldate =

work page doi:10.1109/jsait.2020.3039509 2020

[15] [15]

2021 , month = jun, journal =

Statistical Properties of Sketching Algorithms , author =. 2021 , month = jun, journal =. doi:10.1093/biomet/asaa062 , urldate =

work page doi:10.1093/biomet/asaa062 2021

[16] [16]

and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =

Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =. Streaming. 2019 , doi =

work page 2019

[17] [17]

Algorithmic

Derezi. Algorithmic. The. 2023 , url =

work page 2023

[18] [18]

Cama. Faster. arXiv preprint arXiv:2508.21189 , publisher =. 2025 , doi =

work page arXiv 2025

[19] [19]

Asymptotics for

Dobriban, Edgar and Liu, Sifan , booktitle =. Asymptotics for

work page

[20] [20]

Chenakkod, Shabarish and Derezi. Optimal. Proceedings of the 56th. 2024 , doi =

work page 2024

[21] [21]

, month = feb, title =

Kireeva, Anastasia and Tropp, Joel A. , month = feb, title =. 2024 , doi =

work page 2024

[22] [22]

, school =

Epperly, Ethan N. , school =. Make the Most of What You Have:. 2025 , doi =

work page 2025

[23] [23]

and Muthukrishnan, S

Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. and Sarl. Faster Least Squares Approximation , volume =. Numerische Mathematik , month = feb, number =. 2011 , doi =

work page 2011

[24] [24]

Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =

Bartan, Burak and Pilanci, Mert , journal =. Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =. 2023 , doi =

work page 2023

[25] [25]

Lacotte, Jonathan and Liu, Sifan and Dobriban, Edgar and Pilanci, Mert , booktitle =. Optimal

work page

[26] [26]

, journal =

Halko, Nathan and Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Finding Structure with Randomness:. 2011 , doi =

work page 2011

[27] [27]

and Webber, Robert J

Tropp, Joel A. and Webber, Robert J. , journal =. Randomized Algorithms for Low-Rank Matrix Approximation: Design, Analysis, and Applictions , year =

work page

[28] [28]

Randomized Algorithms for Low-Rank Matrix Factorizations:

Witten, Rafi and Candès, Emmanuel , journal =. Randomized Algorithms for Low-Rank Matrix Factorizations:. 2014 , doi =

work page 2014

[29] [29]

The Inverted Complex

Shaman, Paul , journal =. The Inverted Complex. 1980 , doi =

work page 1980

[30] [30]

, publisher =

Muirhead, Robb J. , publisher =. Aspects of. 1982 , isbn =

work page 1982

[31] [31]

Mitra, Sujit Kumar , journal =. A. 1970 , issn =

work page 1970

[32] [32]

Positive Definite Matrices , year =

Bhatia, Rajendra , publisher =. Positive Definite Matrices , year =

work page

[33] [33]

Williams, Christopher K. I. and Seeger, Matthias , booktitle =. Using the

work page

[34] [34]

, journal =

Drineas, Petros and Mahoney, Michael W. , journal =. On the. 2005 , issn =

work page 2005

[35] [35]

Revisiting the

Gittens, Alex and Mahoney, Michael , booktitle =. Revisiting the. 2013 , issn =

work page 2013

[36] [36]

and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =

Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =. Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data , volume =

work page

[37] [37]

Ando, Tsuyoshi , booktitle =. Schur. 2005 , doi =

work page 2005