On the principal components of sample covariance matrices
Abstract
We introduce a class of $M \times M$ sample covariance matrices $\mathcal{Q}$ which subsumes and generalizes several previous models. The associated population covariance matrix $\Sigma = \mathbb{E} \mathcal{Q}$ is assumed to differ from the identity by a matrix of bounded rank. All quantities except the rank of $\Sigma - I_M$ may depend on $M$ in an arbitrary fashion. We investigate the principal components, i.e. the top eigenvalues and eigenvectors, of $\mathcal{Q}$. We derive precise large deviation estimates on the generalized components $\langle \mathbf{w}, \boldsymbol{\xi}_i \rangle$ of the outlier and non-outlier eigenvectors $\boldsymbol{\xi}_i$. Our results also hold near the so-called BBP transition, where outliers are created or annihilated, and for degenerate or near-degenerate outliers. We believe the obtained rates of convergence to be optimal. In addition, we derive the asymptotic distribution of the generalized components of the non-outlier eigenvectors. A novel observation arising from our results is that, unlike the eigenvalues, the eigenvectors of the principal components contain information about the subcritical spikes of $\Sigma$. The proofs use several results on the eigenvalues and eigenvectors of the uncorrelated matrix $Q$, satisfying $\mathbb{E} Q = I_M$, as input: the isotropic local Marchenko-Pastur law established in [9], level repulsion, and quantum unique ergodicity of the eigenvectors. The latter is a special case of a new universality result for the joint eigenvalue-eigenvector distribution.
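To make the setting concrete, the following is a minimal numerical sketch of the simplest instance of such a model: a rank-one spike, $\Sigma = I_M + d\, \mathbf{v} \mathbf{v}^*$, with Gaussian data. It is an illustration under stated assumptions, not the construction or the proof technique of the paper. The function names, the choice $\mathbf{v} = \mathbf{e}_1$, the Gaussian entries, and the parameter values are all hypothetical. The formulas in `outlier_location` and `outlier_overlap` are the classical first-order asymptotics for a rank-one spike with $\phi = M/N$ fixed, quoted here for comparison only; the abstract itself does not state them.

```python
import numpy as np


def spiked_sample_covariance(M, N, d, rng):
    """Sample covariance Q = (1/N) Y Y^T with population covariance I + d e1 e1^T.

    Assumption: Gaussian entries and a spike along the first coordinate axis.
    Returns (Q, v) where v = e1 is the spike direction.
    """
    v = np.zeros(M)
    v[0] = 1.0
    Y = rng.standard_normal((M, N))
    Y[0, :] *= np.sqrt(1.0 + d)  # apply Sigma^{1/2} for Sigma = I + d e1 e1^T
    return Y @ Y.T / N, v


def top_component(Q, v):
    """Top eigenvalue of Q and squared overlap of its eigenvector with v."""
    evals, evecs = np.linalg.eigh(Q)  # eigenvalues in ascending order
    return float(evals[-1]), float(np.dot(v, evecs[:, -1]) ** 2)


def bulk_edge(phi):
    """Right edge of the Marchenko-Pastur bulk for phi = M/N."""
    return (1.0 + np.sqrt(phi)) ** 2


def outlier_location(d, phi):
    """Classical limit of the outlier eigenvalue when d > sqrt(phi)."""
    return (1.0 + d) * (1.0 + phi / d)


def outlier_overlap(d, phi):
    """Classical limit of the squared overlap when d > sqrt(phi)."""
    return (1.0 - phi / d**2) / (1.0 + phi / d)


# Small demo: one supercritical and one subcritical spike (threshold is sqrt(phi)).
M, N = 200, 400
phi = M / N
rng = np.random.default_rng(0)
for d in (4.0, 0.3):
    Q, v = spiked_sample_covariance(M, N, d, rng)
    lam, overlap = top_component(Q, v)
    regime = "supercritical" if d > np.sqrt(phi) else "subcritical"
    print(f"d={d} ({regime}): top eigenvalue={lam:.3f}, "
          f"bulk edge={bulk_edge(phi):.3f}, squared overlap={overlap:.3f}")
```

In the supercritical case the top eigenvalue should separate from the bulk edge and its eigenvector should align substantially with the spike direction, roughly as the two comparison formulas suggest. In the subcritical case the top eigenvalue stays near the edge, which is the situation in which the abstract notes that the eigenvectors, unlike the eigenvalues, still carry information about the spike. A single run at this matrix size is only indicative.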