pith. sign in

arxiv: 2411.11728 · v2 · submitted 2024-11-18 · 📊 stat.ME

Davis-Kahan Theorem in the two-to-infinity norm and its application to perfect clustering

Pith reviewed 2026-05-23 16:59 UTC · model grok-4.3

classification 📊 stat.ME
keywords Davis-Kahan theoremtwo-to-infinity normeigenvector perturbationperfect clusteringerror boundsspectral clusteringperturbation analysis
0
0 comments X

The pith

Upper bounds on eigenvector deviations in the two-to-infinity norm hold under varied error assumptions and yield conditions for perfect clustering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles techniques to upper-bound the two-to-infinity norm distance between population eigenvectors U and sample versions hat U. Bounds are first obtained with no or only mild conditions on the error matrix, then sharpened when the errors obey generic probabilistic assumptions. Separate corrections are supplied when the errors are heavy-tailed or decay exponentially fast. The resulting bounds are applied to derive sufficient conditions under which a generic clustering procedure recovers the true partition with no errors.

Core claim

Variants of the Davis-Kahan theorem produce explicit upper bounds on the two-to-infinity norm of U minus hat U across several regimes of the perturbation matrix, from deterministic or mild probabilistic settings through heavy-tailed and exponentially decaying errors, and these bounds translate directly into sufficient conditions for perfect clustering in a general statistical model.

What carries the argument

Two-to-infinity norm Davis-Kahan bounds that control the maximum absolute entry in any row of the eigenvector difference matrix.

If this is right

  • Bounds hold initially with no or mild probabilistic assumptions on the error.
  • Tighter bounds follow once generic probabilistic assumptions on the errors are added.
  • Separate corrections apply to heavy-tailed errors and to exponentially fast decaying errors.
  • The bounds supply sufficient conditions for perfect clustering in a generic setting.
  • Alternative constructions of hat U can be compared directly for accuracy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bounding techniques could be inserted into analyses of eigenvector estimation in principal component analysis or matrix completion.
  • The clustering conditions could be turned into practical finite-sample checks by estimating the eigenvalue gap and error size from data.
  • Different choices among the suggested estimators for hat U might produce strictly smaller two-to-infinity errors in particular applications.

Load-bearing premise

The population matrix must have an eigenvalue gap, and the error matrix must meet the technical requirements that let the Davis-Kahan variants apply.

What would settle it

A concrete matrix whose population version has a verified eigenvalue gap but whose sample eigenvectors exceed the derived two-to-infinity bound under the matching error conditions would falsify the claims.

read the original abstract

Many statistical applications, such as the Principal Component Analysis, matrix completion, tensor regression and many others, rely on accurate estimation of leading eigenvectors of a matrix. The Davis-Kahan theorem is known to be instrumental for bounding above the distances between matrices $U$ and $\widehat{U}$ of population eigenvectors and their sample versions. While those distances can be measured in various metrics, the recent developments have shown advantages of evaluation of the deviation in the two-to-infinity norm. The purpose of this paper is to develop a toolbox for derivation of upper bounds for the distances between $U$ and $\widehat{U}$ in the two-to-infinity norm for a variety of possible scenarios. Although this problem has been studied by several authors, the difference between this paper and its predecessors is that the upper bounds are obtained under various sets of assumptions. The upper bounds are initially derived with no or mild probabilistic assumptions on the error, and are subsequently refined, when some generic probabilistic assumptions on the errors hold. The paper also provides rectification of the upper bounds in the cases of heavy-tailed or exponentially fast decaying errors. In addition, the paper suggests alternative methods for evaluation of $\widehat{U}$ and, therefore, enables one to compare the resulting accuracies. As an example of an application of the techniques in the paper, we derive sufficient conditions for perfect clustering in a generic setting, and then employ them in various scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper develops a toolbox of upper bounds on the two-to-infinity norm distance between population eigenvectors U and sample eigenvectors Ũ under successively stronger assumptions on the perturbation (error) matrix: first with no or mild probabilistic assumptions, then under generic probabilistic assumptions, and with rectifications for heavy-tailed and sub-exponential error regimes. It also proposes alternative estimators for Ũ and derives sufficient conditions for perfect clustering as an application, relying on an eigenvalue gap in the population matrix and suitable control on the error matrix to apply a Davis-Kahan variant.

Significance. If the derivations hold, the work supplies a systematic collection of 2-to-∞ eigenvector perturbation bounds that can be directly applied in PCA, matrix completion, tensor regression, and clustering analyses. The explicit refinement across error regimes and the generic clustering application are useful; the manuscript also supplies alternative Ũ constructions that allow accuracy comparisons.

minor comments (3)
  1. The abstract and introduction would benefit from a short table or enumerated list that explicitly maps each assumption regime (no/mild, generic probabilistic, heavy-tailed, sub-exponential) to the corresponding bound statement and the section where it appears.
  2. Notation for the alternative estimators of Ũ is introduced without a dedicated subsection; a brief comparison paragraph or small table summarizing their respective 2-to-∞ bounds would improve readability.
  3. In the clustering application, the sufficient condition that the bound is smaller than half the eigenvalue gap is stated generically; an explicit numerical illustration with one of the derived bounds would help readers verify the translation from perturbation bound to perfect recovery.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the paper's significance, and recommendation of minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivations rest on external eigenvalue gap and error assumptions

full rationale

The paper develops a toolbox of 2-to-infinity norm bounds by applying variants of the Davis-Kahan theorem under explicit assumption regimes on the perturbation matrix (none/mild, generic probabilistic, heavy-tailed, sub-exponential). These start from a population eigenvalue gap plus error-matrix conditions that are stated as inputs, not fitted or self-defined quantities. The clustering application is a direct, standard consequence of the derived bounds being smaller than half the gap. No self-citation chain, ansatz smuggling, or renaming of known results is load-bearing; the abstract and high-level program describe independent derivations from the invoked theorem variants. This is the common honest case of a self-contained perturbation analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard perturbation theory assumptions for the Davis-Kahan framework plus the specific error regimes listed; no free parameters or invented entities are mentioned in the abstract.

axioms (2)
  • domain assumption Existence of a sufficient eigenvalue gap between leading and remaining eigenvalues of the population matrix
    Required to apply any variant of the Davis-Kahan theorem when bounding eigenvector deviations.
  • domain assumption The error matrix satisfies conditions allowing application of the theorem under the successive assumption regimes (mild probabilistic, generic, heavy-tailed, exponential decay)
    Invoked when refining bounds from no assumptions to probabilistic ones and when rectifying for tail behavior.

pith-pipeline@v0.9.0 · 5778 in / 1397 out tokens · 31647 ms · 2026-05-23T16:59:11.037037+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Consistent line clustering using geometric hypergraphs

    math.ST 2025-05 unverdicted novelty 7.0

    Derives information-theoretic recovery thresholds for two intersecting lines with polynomial mass concentration near the intersection and matches them (up to polylog factors) via a spectral algorithm on a hypergraph b...

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 1 Pith paper · 1 internal anchor

  1. [1]

    Community detection and stochastic bloc k models: Recent developments

    Emmanuel Abbe. Community detection and stochastic bloc k models: Recent developments. J. Mach. Learn. Res. , 18(177):1–86, 2018

  2. [2]

    , Bandeira , Afonso S

    Emmanuel Abbe, Afonso S. Bandeira, and Georgina Hall. Ex act recovery in the stochastic block model. IEEE Transactions on Information Theory , 62(1):471–487, 2016. ISSN 0018-9448. doi: 10.1109/TIT.2015.2490670

  3. [3]

    An \ ell\_p\ theory of PCA and spectral clustering

    Emmanuel Abbe, Jianqing Fan, and Kaizheng Wang. An ℓp theory of PCA and spectral clustering. The Annals of Statistics , 50(4):2359 – 2385, 2022. doi: 10.1214/22-AOS2196. URL https://doi.org/10.1214/22-AOS2196

  4. [4]

    An Overview of Asymptotic Normality in Stochastic Blockmodels : Cluster Analysis and Inference , May 2023

    Joshua Agterberg and Joshua Cape. An overview of asympto tic normality in stochas- tic blockmodels: Cluster analysis and inference. ArXiv:2305.06353, 2023. URL https://arxiv.org/abs/2305.06353

  5. [5]

    Amini and Elizaveta Levina

    Arash A. Amini and Elizaveta Levina. On semidefinite rela xations for the block model. Ann. Statist., 46(1):149–179, 2018. doi: 10.1214/17-AOS1545

  6. [6]

    Vincent Poor, and Yuxin Chen

    Changxiao Cai, Gen Li, Yuejie Chi, H. Vincent Poor, and Yu xin Chen. Subspace estimation from unbalanced and incomplete data matrices: ℓ2,∞ statistical guarantees. The Annals of Statistics , 49(2):944 – 967, 2021. doi: 10.1214/20-AOS1986. URL https://doi.org/10.1214/20-AOS1986

  7. [7]

    Tony Cai and Anru Zhang

    T. Tony Cai and Anru Zhang. Rate-optimal perturbation bo unds for singular subspaces with applications to high-dimensional statistics. The Annals of Statistics , 46(1):60 – 89, 2018. doi: 10.1214/17-AOS1541. URL https://doi.org/10.1214/17-AOS1541

  8. [8]

    Joshua Cape, Minh Tang, and Carey E. Priebe. The two-to-i nfinity norm and singular subspace geometry with applications to high-dimensional statistic s. The Annals of Statistics , 47(5):2405 – 2439, 2019. doi: 10.1214/18-AOS1752. URL https://doi.org/10.1214/18-AOS1752

  9. [9]

    Sub sampling based community detec- tion for large networks

    Sayan Chakrabarty, Srijan Sengupta, and Yuguo Chen. Sub sampling based community detec- tion for large networks. Statistica Sinica, in press, 2023. doi: 10.5705/ss.202022.0108

  10. [10]

    Infer ence and uncertainty quantification for noisy matrix completion

    Yuxin Chen, Jianqing Fan, Cong Ma, and Yuling Yan. Infer ence and uncertainty quantification for noisy matrix completion. Proceedings of the National Academy of Sciences , 116(46):22931–22937, 2019. doi: 10.1073/pnas.19100531 16. URL https://www.pnas.org/doi/abs/10.1073/pnas.1910053116

  11. [11]

    Asymmetry helps: Eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices

    Yuxin Chen, Chen Cheng, and Jianqing Fan. Asymmetry hel ps: Eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices. The Annals of Statistics , 49(1):435 – 458, 2021. doi: 10.1214/20-AOS1963. URL https://doi.org/10.1214/20-AOS1963

  12. [12]

    Spectral Methods for Data Science : A Statistical Perspective

    Yuxin Chen, Yuejie Chi, Jianqing Fan, and Cong Ma. Spect ral methods for data science: A statistical perspective. Foundations and Trends in Machine Learning , 14(5):566–806, 2021. ISSN 1935-8245. doi: 10.1561/2200000079. URL http://dx.doi.org/10.1561/2200000079

  13. [13]

    Chandler Davis and W. M. Kahan. The rotation of eigenvec tors by a perturbation. iii. SIAM Journal on Numerical Analysis , 7(1):1–46, 1970. doi: 10.1137/0707001. URL https://doi.org/10.1137/0707001. 24

  14. [14]

    Partial recovery bounds for clustering with the relaxed $K$means

    Christophe Giraud and Nicolas Verzelen. Partial recov ery bounds for clustering with the relaxed kmeans. ArXiv:1807.07547, 2018

  15. [15]

    Gower and Garmt B

    John C. Gower and Garmt B. Dijksterhuis. Procrustes problems, volume 30 of Oxford Statistical Science Series . Oxford University Press, Oxford, UK, January 2004. ISBN 01 98510586. URL http://www.oup.com/uk/catalogue/?ci=9780198510581

  16. [16]

    Kumar, Y

    A. Kumar, Y. Sabharwal, and S. Sen. A simple linear time ( 1 + epsiv;)-approximation algorithm for k-means clustering in any dimensions. In 45th Annual IEEE Symposium on Foundations of Computer Science , pages 454–462, Oct 2004. doi: 10.1109/FOCS.2004.7

  17. [17]

    Jing Lei and Kevin Z. Lin. Bias-adjusted spectral clust ering in multi-layer stochastic block models. Journal of the American Statistical Association , 118(544):2433–2445, 2023. doi: 10. 1080/01621459.2022.2054817

  18. [18]

    Consistency of spectr al clustering in stochastic block models

    Jing Lei and Alessandro Rinaldo. Consistency of spectr al clustering in stochastic block models. Ann. Statist. , 43(1):215–237, 2015. doi: 10.1214/14-AOS1274

  19. [19]

    Unified \ ell\_\ 2 rightarrow infty\ \ Eigenspace Perturbation Theory for Symmetric Random Matrices

    Lihua Lei. Unified ℓ2→∞ eigenspace perturbation theory for symmetric random matri ces. ArXiv: 1909.04798, 2020. URL https://arxiv.org/abs/1909.04798

  20. [20]

    Zhang, and Harrison H

    Matthias Loffler, Anderson Y. Zhang, and Harrison H. Zhou . Optimality of spectral clustering in the Gaussian mixture model. The Annals of Statistics , 49(5):2506 – 2530, 2021. doi: 10. 1214/20-AOS2044. URL https://doi.org/10.1214/20-AOS2044

  21. [21]

    Two provably consis- tent divide-and-conquer clustering algorithms for large n etworks

    Soumendu Sundar Mukherjee, Purnamrita Sarkar, and Pet er J Bickel. Two provably consis- tent divide-and-conquer clustering algorithms for large n etworks. Proceedings of the National Academy of Sciences , 118(44):e2100482118, 2021

  22. [22]

    Sharp optimal recovery in the two compo nent Gaussian mixture model

    Mohamed Ndaoud. Sharp optimal recovery in the two compo nent Gaussian mixture model. The Annals of Statistics , 50(4):2096 – 2126, 2022. doi: 10.1214/22-AOS2178. URL https://doi.org/10.1214/22-AOS2178

  23. [23]

    Signed diverse multiplex networks: C lustering and inference

    Marianna Pensky. Signed diverse multiplex networks: C lustering and inference. ArXiv:2204.12087, 2024

  24. [24]

    Clustering of diverse multiplex networks

    Marianna Pensky and Yaxuan Wang. Clustering of diverse multiplex networks. IEEE Transac- tions on Network Science and Engineering , 11(4):3441–3454, 2024. doi: 10.1109/TNSE.2024. 3374102

  25. [25]

    Spectral clus tering and the high-dimensional stochastic blockmodel

    Karl Rohe, Sourav Chatterjee, and Bin Yu. Spectral clus tering and the high-dimensional stochastic blockmodel. Ann. Statist. , 39(4):1878–1915, 2011

  26. [26]

    Adaptive clustering through semidefinit e programming

    Martin Royer. Adaptive clustering through semidefinit e programming. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. V ish- wanathan, and R. Garnett, editors, Advances in Neural Information Pro- cessing Systems 30 , pages 1795–1803. Curran Associates, Inc., 2017. URL http://papers.nips.cc/paper/6776-adaptive-clusterin g-through-semidefinit...

  27. [27]

    Patrick Rubin-Delanchy, Joshua Cape, Minh Tang, and Ca rey E. Priebe. A Statistical Inter- pretation of Spectral Embedding: The Generalised Random Do t Product Graph. Journal of the Royal Statistical Society Series B: Statistical Method ology, 84(4):1446–1473, 2022. ISSN 1369-7412. doi: 10.1111/rssb.12509. URL https://doi.org/10.1111/rssb.12509. 25

  28. [28]

    High-Dimensional Probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics

    Roman Vershynin. High-Dimensional Probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics . Cambridge University Press, 2018

  29. [30]

    URL https://arxiv.org/abs/2403.09170

  30. [31]

    Entrywise limit theorems for eigenvect ors of signal-plus-noise matrix mod- els with weak signals

    Fangzheng Xie. Entrywise limit theorems for eigenvect ors of signal-plus-noise matrix mod- els with weak signals. Bernoulli, 30(1):388 – 418, 2024. doi: 10.3150/23-BEJ1602. URL https://doi.org/10.3150/23-BEJ1602

  31. [32]

    Inference for heteroskedastic PCA with miss- ing data

    Yuling Yan, Yuxin Chen, and Jianqing Fan. Inference for heteroskedastic PCA with miss- ing data. The Annals of Statistics , 52(2):729 – 756, 2024. doi: 10.1214/24-AOS2366. URL https://doi.org/10.1214/24-AOS2366

  32. [33]

    Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the davis-kahan theorem for statisti- cians. Biometrika, 102(2):315–323, 2014. ISSN 0006-3444. doi: 10.1093/biom et/asv008. URL https://doi.org/10.1093/biomet/asv008

  33. [34]

    Fundamental limits of spectral clus tering in stochastic block models

    Anderson Ye Zhang. Fundamental limits of spectral clus tering in stochastic block models. IEEE Transactions on Information Theory , 70(10):7320–7348, 2024. doi: 10.1109/TIT.2024.3425581

  34. [35]

    Limit results for distribu ted estimation of invariant subspaces in multiple networks inference and pca

    Runbing Zheng and Minh Tang. Limit results for distribu ted estimation of invariant subspaces in multiple networks inference and pca. ArXiv: 2206.04306 , 2022. doi: 10.48550/ARXIV.2206. 04306

  35. [36]

    Deflated HeteroPCA : Overcoming the curse of ill-conditioning in heteroskedastic PCA , March 2023

    Yuchen Zhou and Yuxin Chen. Deflated heteropca: Overcom ing the curse of ill-conditioning in heteroskedastic pca. ArXiv: 2303.06198 , 2024. URL https://arxiv.org/abs/2303.06198. 26 7 Proofs 7.1 Proofs of statements in Section 2. Proof of Theorem 2. Note that, by Weyl’s theorem, one has ˆλ r = λ r( ˆY ) ≥ λ r − ∥ E ∥, so that   ˆΛ− 1    = |ˆλ r|− 1 ≤ ...

  36. [37]

    Inequality (2.8) is the direct consequence of (2.6) and (2.7)

    (7.6) Finally, combining (7.2)–(7.6) and taking into account tha t ∆ 0 ≤ 1/ 4, obtain (2.7). Inequality (2.8) is the direct consequence of (2.6) and (2.7). 27 Proof of Theorem 3. Denote the sets, on which (2.6) and (2.10) are true, by, respe ctively, Ω τ,1 and Ω τ,1. Denote Ω τ = Ωτ,1 ∩ Ωτ,1 and observe that P(Ωτ ) ≥ 1 − 2 n− τ . Note that, due to (2.12) ...

  37. [38]

    leave one out

    Also, ∥ ˆU ∥2,∞ ≤ ∥ ˆU − U WU ∥2,∞ + ∥U ∥2,∞ . Combining all inequalities above and recalling that ǫ0 = o(1), immediately obtain (7.10). In order to prove (7.11), we use the “leave one out” method. Sp ecifically, fix l ∈ [n] and let ˆY (l) = ˆU (l) ˆΛ(l)( ˆU (l))T + ˆU (l) ⊥ ˆΛ(l) ⊥ ( ˆU (l) ⊥ )T be the SVD of ˆY (l), where ˆU (l) ∈ O n,r and ˆU (l) ⊥ ∈ O n...