Canonical correlation coefficients of high-dimensional normal vectors: finite rank case
Abstract
Consider a normal vector $z=(x',y')'$ consisting of two sub-vectors $x$ and $y$ with dimensions $p$ and $q$ respectively. With $n$ independent observations of $z$ at hand, we study the correlation between $x$ and $y$ from the perspective of Canonical Correlation Analysis in the high-dimensional setting, where both $p$ and $q$ are proportional to the sample size $n$. In this paper, we focus on the case where $\Sigma_{xy}$ is of finite rank $k$, i.e. there are $k$ nonzero canonical correlation coefficients, whose squares are denoted by $r_1\geq\cdots\geq r_k>0$. Under the additional assumptions $(p+q)/n\to y\in(0,1)$ and $p/q\not\to 1$, we study the sample counterparts of $r_i$, $i=1,\ldots,k$, i.e. the largest $k$ eigenvalues of the sample canonical correlation matrix $S_{xx}^{-1}S_{xy}S_{yy}^{-1}S_{yx}$, namely $\lambda_1\geq\cdots\geq\lambda_k$. We show that there exists a threshold $r_c\in(0,1)$ such that, for each $i\in\{1,\ldots,k\}$, when $r_i\leq r_c$, $\lambda_i$ converges almost surely to the right edge of the limiting spectral distribution of the sample canonical correlation matrix, denoted by $d_r$. When $r_i>r_c$, $\lambda_i$ possesses an almost sure limit in $(d_r,1]$, from which we can recover $r_i$ in turn, thus providing an estimate of the latter in the high-dimensional scenario.
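To make the objects in the abstract concrete, the following is a minimal numerical sketch, not the authors' code. It forms the sample canonical correlation matrix from simulated data and returns its eigenvalues. Several choices here are assumptions made only for illustration: the observations are taken to be mean zero (no centering), the sample covariances are normalized by n, the rank-one population model puts a single squared canonical correlation r between the first coordinates of x and y, and the function names and the values of n, p, q, r are hypothetical.

```python
import numpy as np


def sample_canonical_correlations(x, y):
    """Eigenvalues of S_xx^{-1} S_xy S_yy^{-1} S_yx, sorted in descending order.

    x has shape (n, p) and y has shape (n, q); each row is one observation.
    Assumption: the data are mean zero, so no centering is applied.
    """
    n = x.shape[0]
    s_xx = x.T @ x / n
    s_yy = y.T @ y / n
    s_xy = x.T @ y / n
    # Solve linear systems rather than forming explicit inverses.
    a = np.linalg.solve(s_xx, s_xy)    # S_xx^{-1} S_xy, shape (p, q)
    b = np.linalg.solve(s_yy, s_xy.T)  # S_yy^{-1} S_yx, shape (q, p)
    eigenvalues = np.linalg.eigvals(a @ b)
    # The eigenvalues are real in exact arithmetic; drop round-off imaginary parts.
    return np.sort(eigenvalues.real)[::-1]


def simulate_rank_one(r, n, p, q, seed=0):
    """Simulate a rank-one Sigma_xy with squared canonical correlation r.

    Assumption (illustrative model): Sigma_xx and Sigma_yy are identity
    matrices, and only the first coordinates of x and y are correlated.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, p))
    y = rng.standard_normal((n, q))
    # Correlate the first coordinate of y with the first coordinate of x.
    y[:, 0] = np.sqrt(r) * x[:, 0] + np.sqrt(1.0 - r) * y[:, 0]
    return sample_canonical_correlations(x, y)


if __name__ == "__main__":
    # Hypothetical sizes with (p + q) / n = 0.3 and p / q = 0.5.
    lam = simulate_rank_one(r=0.8, n=2000, p=200, q=400)
    print("largest sample eigenvalues:", lam[:3])
```

Running the simulation for several values of r and comparing the largest eigenvalue with the second largest gives a rough empirical picture of the threshold behaviour the abstract describes. The exact values of the threshold and of the right edge are given in the paper and are not reproduced here.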