pith. sign in

arxiv: 2411.11728 · v2 · submitted 2024-11-18 · 📊 stat.ME

Davis-Kahan Theorem in the two-to-infinity norm and its application to perfect clustering

classification 📊 stat.ME
keywords boundsupperassumptionsdistancesnormtwo-to-infinityvariouswidehat
0
0 comments X
read the original abstract

Many statistical applications, such as the Principal Component Analysis, matrix completion, tensor regression and many others, rely on accurate estimation of leading eigenvectors of a matrix. The Davis-Kahan theorem is known to be instrumental for bounding above the distances between matrices $U$ and $\widehat{U}$ of population eigenvectors and their sample versions. While those distances can be measured in various metrics, the recent developments have shown advantages of evaluation of the deviation in the two-to-infinity norm. The purpose of this paper is to develop a toolbox for derivation of upper bounds for the distances between $U$ and $\widehat{U}$ in the two-to-infinity norm for a variety of possible scenarios. Although this problem has been studied by several authors, the difference between this paper and its predecessors is that the upper bounds are obtained under various sets of assumptions. The upper bounds are initially derived with no or mild probabilistic assumptions on the error, and are subsequently refined, when some generic probabilistic assumptions on the errors hold. The paper also provides rectification of the upper bounds in the cases of heavy-tailed or exponentially fast decaying errors. In addition, the paper suggests alternative methods for evaluation of $\widehat{U}$ and, therefore, enables one to compare the resulting accuracies. As an example of an application of the techniques in the paper, we derive sufficient conditions for perfect clustering in a generic setting, and then employ them in various scenarios.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Consistent line clustering using geometric hypergraphs

    math.ST 2025-05 unverdicted novelty 7.0

    Derives information-theoretic recovery thresholds for two intersecting lines with polynomial mass concentration near the intersection and matches them (up to polylog factors) via a spectral algorithm on a hypergraph b...