Sample canonical correlation coefficients of high-dimensional random vectors: local law and Tracy-Widom limit

Abstract

Consider two random vectors $C_1^{1/2} x \in \mathbb{R}^p$ and $C_2^{1/2} y \in \mathbb{R}^q$, where the entries of $x$ and $y$ are i.i.d. random variables with mean zero and variance one, and $C_1$ and $C_2$ are $p \times p$ and $q \times q$ deterministic population covariance matrices. With $n$ independent samples of $(C_1^{1/2} x, C_2^{1/2} y)$, we study the sample correlation between these two vectors using canonical correlation analysis. We denote by $S_{xx}$ and $S_{yy}$ the sample covariance matrices for $C_1^{1/2} x$ and $C_2^{1/2} y$, respectively, and by $S_{xy}$ the sample cross-covariance matrix. Then the sample canonical correlation coefficients are the square roots of the eigenvalues of the sample canonical correlation matrix $\mathcal{C}_{XY} := S_{xx}^{-1} S_{xy} S_{yy}^{-1} S_{yx}$. Under the high-dimensional setting with $p/n \to c_1 \in (0, 1)$ and $q/n \to c_2 \in (0, 1-c_1)$ as $n \to \infty$, we prove that the largest eigenvalue of $\mathcal{C}_{XY}$ converges to the Tracy-Widom distribution as long as $\lim_{s \to \infty} s^4 \left[ \mathbb{P}(|x_{ij}| \geq s) + \mathbb{P}(|y_{ij}| \geq s) \right] = 0$. This extends the result in [16], which established the Tracy-Widom limit of the largest eigenvalue of $\mathcal{C}_{XY}$ under the assumption that all moments are finite. Our proof is based on a linearization method, which reduces the problem to the study of a $(p+q+2n) \times (p+q+2n)$ random matrix $H$. In particular, we prove an optimal local law on its inverse $G := H^{-1}$, i.e., the resolvent. This local law is the main tool both for the proof of the Tracy-Widom law in this paper and for the study in [22,23] of the canonical correlation coefficients of high-dimensional random vectors with finite rank correlations.
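As a rough numerical illustration of the definition above, the sample canonical correlation coefficients can be computed directly from the data matrices. The sketch below is not taken from the paper: the function name `sample_canonical_correlations`, the Gaussian entries, the identity population covariances, and the particular values of p, q, n are all assumptions made for illustration. Since the entries have mean zero, no centering is applied, and the 1/n normalization cancels in the product defining the canonical correlation matrix.

```python
import numpy as np


def sample_canonical_correlations(X, Y):
    """Return the sample canonical correlation coefficients, largest first.

    X is a p x n array and Y is a q x n array whose columns are the n
    samples (assumed mean zero, so no centering is done here).
    Hypothetical helper written for illustration only.
    """
    n = X.shape[1]
    Sxx = X @ X.T / n  # p x p sample covariance
    Syy = Y @ Y.T / n  # q x q sample covariance
    Sxy = X @ Y.T / n  # p x q sample cross-covariance

    # C_XY = Sxx^{-1} Sxy Syy^{-1} Syx, formed with linear solves
    # rather than explicit matrix inverses.
    left = np.linalg.solve(Sxx, Sxy)     # Sxx^{-1} Sxy, shape p x q
    right = np.linalg.solve(Syy, Sxy.T)  # Syy^{-1} Syx, shape q x p
    C = left @ right                     # p x p

    # The eigenvalues are real and lie in [0, 1] in exact arithmetic;
    # discard round-off in the imaginary part and clip before the root.
    eigenvalues = np.sort(np.linalg.eigvals(C).real)[::-1]
    return np.sqrt(np.clip(eigenvalues, 0.0, 1.0))


# Assumed toy setting: independent Gaussian x and y with C_1 = C_2 = I,
# p/n = 0.1 and q/n = 0.15, so that q/n < 1 - p/n.
rng = np.random.default_rng(0)
p, q, n = 20, 30, 200
X = rng.standard_normal((p, n))
Y = rng.standard_normal((q, n))
coefficients = sample_canonical_correlations(X, Y)
print(coefficients[:3])
```

Because the coefficients are invariant under invertible linear transformations of each vector, replacing `X` by `A @ X` for an invertible `A` should leave the output unchanged up to round-off. That invariance is why the distribution of the eigenvalues of the canonical correlation matrix does not depend on the population covariances when they are invertible.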
