Asymptotic joint distribution of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model
Pith reviewed 2026-05-25 17:32 UTC · model grok-4.3
The pith
The joint limiting distribution of extreme eigenvalues and trace is derived for the generalized spiked population model with proportional growth.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the generalized spiked population model with a fixed number of spikes, under the proportional growth regime where dimension p and sample size n both tend to infinity with p/n approaching a constant, the suitably centered and scaled vector of extreme eigenvalues together with the trace converges in distribution to a multivariate normal limit whose covariance structure is explicitly determined from the model parameters.
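For orientation, the sketch below encodes the classical special case of such a limit, not the paper's generalized formula. It assumes a single simple spike alpha on top of identity noise, real Gaussian data, and p/n tending to gamma. In that setting the spiked-model literature gives the centering alpha + gamma*alpha/(alpha - 1) and the variance 2*alpha^2*(1 - gamma/(alpha - 1)^2) for the sqrt(n)-scaled largest sample eigenvalue. The function names are hypothetical.

```python
import math


def spike_limit(alpha: float, gamma: float) -> float:
    """Almost-sure limit of the largest sample eigenvalue.

    Assumes one population spike `alpha`, all other population eigenvalues
    equal to 1, and p/n -> gamma (classical simple spiked model only).
    """
    threshold = 1.0 + math.sqrt(gamma)
    if alpha <= threshold:
        # Below the phase-transition threshold the sample eigenvalue
        # sticks to the right edge of the Marchenko-Pastur bulk.
        return threshold ** 2
    return alpha + gamma * alpha / (alpha - 1.0)


def spike_variance(alpha: float, gamma: float) -> float:
    """Asymptotic variance of sqrt(n) * (lambda_hat - spike_limit).

    Valid for real Gaussian data and a simple spike above the threshold.
    """
    if alpha <= 1.0 + math.sqrt(gamma):
        raise ValueError("Gaussian fluctuations require alpha > 1 + sqrt(gamma)")
    return 2.0 * alpha ** 2 * (1.0 - gamma / (alpha - 1.0) ** 2)
```

The paper's result concerns the joint law of such eigenvalues with the trace in a more general model, so these marginal formulas only indicate the kind of centering and scaling involved.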
What carries the argument
The joint asymptotic behavior of two classes of spectral processes, one for the extreme eigenvalues and one for the linear spectral statistics (trace).
If this is right
- The joint distribution supplies the covariance terms needed for higher-order bias correction in Johnson-Graybill-type tests.
- The tests for signals become more accurate in finite samples once the dependence between the extreme eigenvalues and the trace is accounted for.
- The same joint convergence applies to any fixed number of the largest eigenvalues together with the trace.
- The approach extends classical marginal limit results by treating the extreme and trace statistics simultaneously.
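To make the role of the eigenvalue-trace dependence concrete, here is a minimal sketch of a Johnson-Graybill-type statistic, taken here to be the ratio of the largest sample eigenvalue to the trace. It assumes mean-zero rows and the uncentered sample covariance X^T X / n. The function name is hypothetical, and the paper's exact statistic and correction terms may differ.

```python
import numpy as np


def largest_eig_to_trace(X: np.ndarray) -> float:
    """Ratio of the largest sample eigenvalue to the trace.

    X is an (n, p) data matrix whose rows are assumed to have mean zero,
    so the sample covariance is taken as X^T X / n without centering.
    """
    n, _ = X.shape
    S = X.T @ X / n
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(S)
    return float(eigvals[-1] / eigvals.sum())
```

Because numerator and denominator are computed from the same matrix, calibrating this ratio requires their joint distribution, not just the two marginal limits.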
Where Pith is reading between the lines
- The same technique could be used to obtain joint limits involving the trace and other smooth functionals of the spectrum.
- In factor models or PCA applications the corrected critical values might reduce over- or under-detection of signals when p and n are comparable.
- Numerical verification of the convergence rate would require generating data exactly under the spiked model and checking the empirical joint distribution against the theoretical normal.
Load-bearing premise
The data exactly follow the generalized spiked population model with a fixed number of spikes and the dimension and sample size grow proportionally as stated.
What would settle it
The claim would be refuted if Monte Carlo simulations drawn from the generalized spiked model with known parameters showed that the normalized extreme eigenvalues and trace fail to jointly approach the predicted multivariate normal distribution.
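A check of this kind could start from the following sketch. It makes simplifying assumptions that are not taken from the paper: real Gaussian data, a single spike alpha with identity noise (population covariance diag(alpha, 1, ..., 1)), and the uncentered sample covariance. The function name and parameters are hypothetical.

```python
import numpy as np


def simulate_top_eig_and_trace(n: int, p: int, alpha: float, reps: int, seed: int = 0):
    """Draw `reps` sample covariance matrices and record (largest eigenvalue, trace).

    Assumes Gaussian data with population covariance diag(alpha, 1, ..., 1).
    """
    rng = np.random.default_rng(seed)
    # Column-wise standard deviations: sqrt(alpha) on the spiked coordinate.
    scale = np.ones(p)
    scale[0] = np.sqrt(alpha)
    top = np.empty(reps)
    trace = np.empty(reps)
    for r in range(reps):
        X = rng.standard_normal((n, p)) * scale
        S = X.T @ X / n
        top[r] = np.linalg.eigvalsh(S)[-1]  # largest eigenvalue
        trace[r] = np.trace(S)
    return top, trace


if __name__ == "__main__":
    n, p, alpha = 400, 200, 4.0
    top, trace = simulate_top_eig_and_trace(n, p, alpha, reps=500)
    # Empirical dependence between the two statistics across replications.
    print("empirical correlation:", np.corrcoef(top, trace)[0, 1])
    # In this simple model E[tr(S)] = alpha + p - 1 exactly.
    print("mean trace minus expectation:", trace.mean() - (alpha + p - 1))
```

The empirical covariance of the centered pair can then be compared against the covariance matrix stated in the paper, and repeating the run for increasing n with p/n fixed gives a rough picture of the convergence rate.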
Original abstract
This paper studies the joint limiting behavior of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model, where the asymptotic regime is such that the dimension and sample size grow proportionally. The form of the joint limiting distribution is applied to conduct Johnson-Graybill-type tests, a family of approaches testing for signals in a statistical model. For this, higher order correction is further made, helping alleviate the impact of finite-sample bias. The proof rests on determining the joint asymptotic behavior of two classes of spectral processes, corresponding to the extreme and linear spectral statistics respectively.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper derives the joint limiting distribution of the extreme eigenvalues and the trace of the sample covariance matrix under a generalized spiked population model in the proportional growth regime (p/n → γ). The derivation proceeds via the joint asymptotics of extreme spectral processes and linear spectral statistics. The limiting joint law is then applied to Johnson-Graybill-type tests for the presence of signals, with an additional higher-order correction term introduced to mitigate finite-sample bias.
Significance. If the joint limiting result holds, the work supplies a technically useful extension of the spiked-model literature by coupling extreme-eigenvalue and trace statistics, which directly improves the calibration of signal-detection procedures. The explicit higher-order bias correction is a practical contribution that addresses a common limitation of first-order asymptotic approximations in moderate-dimensional settings.
minor comments (3)
- The abstract states that the proof rests on joint asymptotics of two classes of spectral processes but supplies no outline of the key steps or error bounds; a brief roadmap in §2 or §3 would help readers verify that the stated joint limit follows from the model assumptions.
- Notation for the generalized spiked model (population eigenvalues, spike locations, and the limiting ratio γ) should be collected in a single display early in the paper to avoid repeated re-definition.
- In the application section, the higher-order correction term is introduced without an explicit statement of the order of the remainder; adding this would clarify the improvement over the first-order approximation.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments appear in the report, so there are no specific points requiring point-by-point rebuttal. We will incorporate any minor editorial or presentational improvements in the revised version.
Circularity Check
No significant circularity; derivation is self-contained asymptotic analysis
full rationale
The paper derives the joint limiting distribution of extreme eigenvalues and trace under the generalized spiked population model via joint asymptotics of extreme and linear spectral processes in the proportional regime. This is a standard first-principles random matrix theory argument with no reduction of any claimed result to fitted parameters, self-definitions, or load-bearing self-citations. The model assumptions and proof strategy are externally consistent with the literature on spiked covariance models; the result does not reduce to its inputs by construction.