pith. sign in

arxiv: 2605.19096 · v1 · pith:YJ553JGCnew · submitted 2026-05-18 · 🧮 math.NA · cs.NA· stat.CO

Sharp analysis of sketched least squares and randomized low-rank approximation

Pith reviewed 2026-05-20 07:22 UTC · model grok-4.3

classification 🧮 math.NA cs.NAstat.CO
keywords sketched least squaresrandomized SVDrandom embeddingsminimax optimalityerror boundslow-rank approximationmatrix sketchingnumerical linear algebra
0
0 comments X

The pith

A random orthonormal matrix is minimax optimal for sketched least squares while rotation-invariant embeddings are optimal for randomized SVD.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper determines the best random embeddings for two common randomized algorithms that compress matrices to reduce computation. For least-squares regression via sketch-and-solve, it shows that a random orthonormal matrix achieves the smallest possible worst-case error. For low-rank approximation via randomized SVD, any rotation-invariant embedding works equally well at optimality. These optimality results deliver the tightest achievable error bounds for both methods. Experiments indicate that many standard embeddings perform nearly as well as the optimal ones in practice.

Core claim

The paper proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm while any rotation-invariant embedding is minimax optimal for the randomized SVD, and obtains the best possible error bounds for sketched least-squares and the randomized SVD.

What carries the argument

Minimax optimality analysis of random embeddings applied to matrix sketching for least-squares and low-rank approximation.

If this is right

  • The derived bounds are the smallest possible for any choice of random embedding in these two algorithms.
  • Practical implementations can safely use orthonormal matrices for sketched regression without sacrificing worst-case guarantees.
  • Rotation-invariant embeddings such as Gaussian or Rademacher matrices all achieve the same optimal performance for randomized SVD.
  • Universality observed in experiments suggests that common embeddings will match optimal accuracy on most inputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The optimality results could extend to other sketching tasks such as kernel approximation or streaming regression.
  • Knowing the minimax embeddings may simplify theoretical comparisons across different randomized numerical linear algebra methods.
  • The universality phenomenon raises the question of whether similar near-optimality holds for embeddings outside the rotation-invariant class in finite dimensions.

Load-bearing premise

The error bounds and optimality claims rest on standard random matrix concentration inequalities without extra conditions on matrix coherence.

What would settle it

A concrete matrix and embedding pair where a non-orthonormal sketch produces strictly smaller worst-case least-squares error than an orthonormal one, or where a non-rotation-invariant embedding beats all rotation-invariant ones for SVD approximation.

Figures

Figures reproduced from arXiv: 2605.19096 by Ethan N. Epperly, Robert J. Webber.

Figure 1
Figure 1. Figure 1: Accuracy parameter ε = E ° °B − A(Ω∗ A) +(Ω∗B) ° ° 2 F / ° °B − A A+B ° ° 2 F−1 as a function of embedding dimension ℓ for eight random embeddings. All embeddings follow the theoretical predictions of Theo￾rem 2.1 for either the Gaussian matrix or random orthonormal matrix. Theorem 2.2 (Sketch-and-solve: Sharp lower bound). Let Ω ∈ Kn×ℓ be any full-rank random embed￾ding. For any problem dimensions d,p ∈ N… view at source ↗
Figure 2
Figure 2. Figure 2: Mean-square error E∥A − Ab∥ 2 F as a function of embedding dimension ℓ for six random embed￾dings. The error of the optimal rank-ℓ approximation is shown for reference. For the “step” problems, all embeddings follow the new bound (Theorem 3.2), except for sign-based embeddings at small values of ℓ. For the “poly” problems, the bound predicts the mean-square error up to a factor of 2.5×. 3.2 RANDOMIZED SVD:… view at source ↗
read the original abstract

Two widely used randomized algorithms are the sketch-and-solve method for least-squares regression and the randomized SVD for low-rank approximation. These algorithms apply a random embedding to compress a target matrix, and they perform computations on the compressed matrix to save computational cost. This paper asks, what is the optimal random embedding in these algorithms? Also, what is the sharpest possible error bound for the optimal embedding? The paper proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm while any rotation-invariant embedding is minimax optimal for the randomized SVD. Following these results, the paper obtains the best possible error bounds for sketched least-squares and the randomized SVD. Last, empirical experiments provide evidence of universality phenomena, in which several random embeddings lead to similar accuracy to the optimal embeddings in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that a random orthonormal matrix is minimax optimal for the sketch-and-solve algorithm applied to least-squares regression, while any rotation-invariant embedding is minimax optimal for the randomized SVD. It derives the sharpest possible error bounds for both settings by matching explicit lower-bound constructions to upper bounds obtained from Johnson-Lindenstrauss-type concentration, under the regimes m = Ω(n + log(1/δ)) for least squares and m = Ω(k + log(1/δ)) for SVD, with no additional coherence assumptions beyond standard sub-Gaussian row norms. Empirical experiments are included as supporting evidence for universality phenomena across embeddings.

Significance. If the results hold, the paper makes a valuable contribution to randomized numerical linear algebra by supplying the first sharp, minimax-optimal characterizations of embeddings together with matching upper and lower bounds. The explicit lower-bound constructions for the sketch-and-solve case and the invariance reduction for randomized SVD are strengths that deliver parameter-free, falsifiable guarantees under standard assumptions.

minor comments (2)
  1. [Introduction] The introduction would benefit from a single consolidated statement of the precise sketch-size regimes (m = Ω(n + log(1/δ)) and m = Ω(k + log(1/δ))) required for the sharp constants to hold with high probability.
  2. [Experiments] In the experimental section, a short remark clarifying that the universality plots are illustrative rather than part of the formal claims would prevent any misreading of their role.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. We are pleased that the contributions on minimax optimality for random orthonormal matrices in sketched least squares and rotation-invariant embeddings in randomized SVD are recognized, along with the matching sharp bounds under the stated regimes.

Circularity Check

0 steps flagged

No significant circularity; derivation uses independent minimax lower bounds and concentration inequalities

full rationale

The paper establishes minimax optimality for random orthonormal embeddings in sketch-and-solve and rotation-invariant embeddings in randomized SVD by deriving matching upper bounds from standard sub-Gaussian concentration (Johnson-Lindenstrauss type) and explicit lower-bound constructions based on invariance under orthogonal transformations. These steps are self-contained within the stated assumptions on sketch dimension m = Ω(n + log(1/δ)) or m = Ω(k + log(1/δ)) and do not reduce to fitted parameters, self-definitions, or load-bearing self-citations. The universality experiments are explicitly separated as supporting evidence rather than part of the formal claims. The central results therefore rest on external probabilistic tools and direct constructions rather than circular reduction to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work rests on standard assumptions from random matrix theory and minimax analysis in numerical linear algebra; no free parameters, invented entities, or ad-hoc axioms are indicated in the abstract.

pith-pipeline@v0.9.0 · 5665 in / 1096 out tokens · 34860 ms · 2026-05-20T07:22:13.906863+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 2 internal anchors

  1. [1]

    The spectral norm error of the naive Nystrom extension

    Gittens, Alex , year =. The Spectral Norm Error of the Na. arXiv:1110.5305 [cs, math] , eprint =

  2. [2]

    Practical sketching algorithms for low-rank matrix approximation

    Practical Sketching Algorithms for Low-Rank Matrix Approximation , author =. 2017 , month = jan, journal =. doi:10.1137/17M1111590 , urldate =. arXiv , keywords =:1609.00048 , pages =

  3. [3]

    In: Proceed- ings of the Forty-First Annual ACM Symposium on Theory of Computing

    Clarkson, Kenneth L. and Woodruff, David P. , year =. Numerical Linear Algebra in the Streaming Model , booktitle =. doi:10.1145/1536414.1536445 , urldate =

  4. [4]

    Nakatsukasa , Fast and stable randomized low-rank matrix approximation , arXiv:2009.11392 [math.NA], (2020)

    Fast and Stable Randomized Low-Rank Matrix Approximation , author =. 2020 , month = sep, number =. arXiv , keywords =:2009.11392 , primaryclass =

  5. [5]

    Murray, J

    Murray, Riley and Demmel, James and Mahoney, Michael W. and Erichson, N. Benjamin and Melnichenko, Maksim and Malik, Osman Asif and Grigori, Laura and Luszczek, Piotr and Derezi. Randomized. 2023 , month = apr, number =. doi:10.48550/arXiv.2302.11474 , urldate =. arXiv , keywords =:2302.11474 , primaryclass =

  6. [6]

    Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for

    Dong, Yijun and Martinsson, Per-Gunnar , year =. Simpler Is Better: A Comparative Study of Randomized Pivoting Algorithms for. Advances in Computational Mathematics , volume =. doi:10.1007/s10444-023-10061-z , urldate =

  7. [7]

    2008 , month = nov, journal =

    A Fast Randomized Algorithm for the Approximation of Matrices , author =. 2008 , month = nov, journal =. doi:10.1016/j.acha.2007.12.002 , urldate =

  8. [8]

    2006 , series =

    The. 2006 , series =

  9. [9]

    and Muthukrishnan, S

    Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. , booktitle =. Sampling Algorithms for ^2 Regression and Applications , year =

  10. [10]

    Improved

    Sarl. Improved. 2006 47th. 2006 , doi =

  11. [11]

    Distributed sketching methods for privacy preserving regression,

    Bartan, Burak and Pilanci, Mert , month = jun, title =. arXiv preprint arXiv:2002.06538 , year =

  12. [12]

    , journal =

    Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Randomized Numerical Linear Algebra:. 2020 , doi =

  13. [13]

    Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

    Chen, Tyler and Duman Keles, Feyza and Halikias, Diana and Musco, Cameron and Musco, Christopher and Persson, David , year = 2025, pages =. Near-Optimal Hierarchical Matrix Approximation from Matrix-Vector Products , booktitle =

  14. [14]

    Sridhar, Srivatsan and Pilanci, Mert and. Lower. 2020 , month = nov, journal =. doi:10.1109/JSAIT.2020.3039509 , urldate =

  15. [15]

    2021 , month = jun, journal =

    Statistical Properties of Sketching Algorithms , author =. 2021 , month = jun, journal =. doi:10.1093/biomet/asaa062 , urldate =

  16. [16]

    and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =

    Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , journal =. Streaming. 2019 , doi =

  17. [17]

    Algorithmic

    Derezi. Algorithmic. The. 2023 , url =

  18. [18]

    Cama. Faster. arXiv preprint arXiv:2508.21189 , publisher =. 2025 , doi =

  19. [19]

    Asymptotics for

    Dobriban, Edgar and Liu, Sifan , booktitle =. Asymptotics for

  20. [20]

    Chenakkod, Shabarish and Derezi. Optimal. Proceedings of the 56th. 2024 , doi =

  21. [21]

    , month = feb, title =

    Kireeva, Anastasia and Tropp, Joel A. , month = feb, title =. 2024 , doi =

  22. [22]

    , school =

    Epperly, Ethan N. , school =. Make the Most of What You Have:. 2025 , doi =

  23. [23]

    and Muthukrishnan, S

    Drineas, Petros and Mahoney, Michael W. and Muthukrishnan, S. and Sarl. Faster Least Squares Approximation , volume =. Numerische Mathematik , month = feb, number =. 2011 , doi =

  24. [24]

    Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =

    Bartan, Burak and Pilanci, Mert , journal =. Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds , volume =. 2023 , doi =

  25. [25]

    Lacotte, Jonathan and Liu, Sifan and Dobriban, Edgar and Pilanci, Mert , booktitle =. Optimal

  26. [26]

    , journal =

    Halko, Nathan and Martinsson, Per-Gunnar and Tropp, Joel A. , journal =. Finding Structure with Randomness:. 2011 , doi =

  27. [27]

    and Webber, Robert J

    Tropp, Joel A. and Webber, Robert J. , journal =. Randomized Algorithms for Low-Rank Matrix Approximation: Design, Analysis, and Applictions , year =

  28. [28]

    Randomized Algorithms for Low-Rank Matrix Factorizations:

    Witten, Rafi and Candès, Emmanuel , journal =. Randomized Algorithms for Low-Rank Matrix Factorizations:. 2014 , doi =

  29. [29]

    The Inverted Complex

    Shaman, Paul , journal =. The Inverted Complex. 1980 , doi =

  30. [30]

    , publisher =

    Muirhead, Robb J. , publisher =. Aspects of. 1982 , isbn =

  31. [31]

    Mitra, Sujit Kumar , journal =. A. 1970 , issn =

  32. [32]

    Positive Definite Matrices , year =

    Bhatia, Rajendra , publisher =. Positive Definite Matrices , year =

  33. [33]

    Williams, Christopher K. I. and Seeger, Matthias , booktitle =. Using the

  34. [34]

    , journal =

    Drineas, Petros and Mahoney, Michael W. , journal =. On the. 2005 , issn =

  35. [35]

    Revisiting the

    Gittens, Alex and Mahoney, Michael , booktitle =. Revisiting the. 2013 , issn =

  36. [36]

    and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =

    Tropp, Joel A. and Yurtsever, Alp and Udell, Madeleine and Cevher, Volkan , booktitle =. Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data , volume =

  37. [37]

    Ando, Tsuyoshi , booktitle =. Schur. 2005 , doi =