Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law
Abstract
We prove a query complexity lower bound for approximating the top $r$-dimensional eigenspace of a matrix. We consider an oracle model where, given a symmetric matrix $M \in \mathbb{R}^{d \times d}$, an algorithm $\mathsf{Alg}$ is allowed to make $T$ exact queries of the form $w^{(i)} = M v^{(i)}$ for $i \in \{1, \dots, T\}$, where $v^{(i)}$ is drawn from a distribution which depends arbitrarily on the past queries and measurements $\{v^{(j)}, w^{(j)}\}_{1 \le j \le i-1}$. We show that for every $\mathtt{gap} \in (0, 1/2]$, there exists a distribution over matrices $M$ for which 1) $\mathrm{gap}_r(M) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(M)$ is the normalized gap between the $r$-th and $(r+1)$-st largest-magnitude eigenvalues of $M$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identify a matrix $V \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle V, M V \rangle \ge (1 - \mathrm{const} \times \mathtt{gap}) \sum_{i=1}^{r} \lambda_i(M)$. Our bound requires only that $d$ is a small polynomial in $1/\mathtt{gap}$ and $r$, and matches the upper bounds of Musco and Musco '15. Moreover, it establishes a strict separation between convex optimization and randomized, "strict-saddle" non-convex optimization, of which PCA is a canonical example: in the former, first-order methods can have dimension-free iteration complexity, whereas in PCA, the iteration complexity of gradient-based methods must necessarily grow with the dimension.
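To make the oracle model concrete, the following is a minimal Python sketch under stated assumptions, not the paper's construction. The class `MatVecOracle`, the functions `block_power_method` and `captured_fraction`, and the test matrix are all hypothetical names and choices introduced here for illustration. The algorithm shown is plain orthogonal (block power) iteration, chosen only because it is a simple adaptive method that touches the matrix solely through matrix-vector queries; the hard instance behind the lower bound is a different, carefully chosen random matrix.

```python
import numpy as np


class MatVecOracle:
    """Hides a symmetric matrix M and counts exact queries w = M v."""

    def __init__(self, M):
        self._M = M
        self.queries = 0

    def query(self, v):
        self.queries += 1
        return self._M @ v


def block_power_method(oracle, d, r, iters, rng):
    """Illustrative adaptive algorithm (an assumption, not the paper's):
    orthogonal iteration that uses r oracle queries per iteration."""
    V, _ = np.linalg.qr(rng.standard_normal((d, r)))
    for _ in range(iters):
        # Each column is one query; later queries depend on earlier answers.
        W = np.column_stack([oracle.query(V[:, k]) for k in range(r)])
        V, _ = np.linalg.qr(W)
    return V


def captured_fraction(M, V):
    """Ratio <V, M V> / sum_{i<=r} lambda_i(M) from the success criterion."""
    r = V.shape[1]
    top = np.sort(np.linalg.eigvalsh(M))[::-1][:r]
    return np.trace(V.T @ M @ V) / top.sum()


rng = np.random.default_rng(0)
d, r, gap, iters = 200, 2, 0.1, 100

# Hypothetical easy instance: r eigenvalues equal to 1, the rest in [0, 1 - gap].
eigs = np.concatenate([np.ones(r), np.linspace(0.0, 1.0 - gap, d - r)])
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
M = Q @ np.diag(eigs) @ Q.T
M = (M + M.T) / 2  # enforce exact symmetry

oracle = MatVecOracle(M)
V = block_power_method(oracle, d, r, iters, rng)
frac = captured_fraction(M, V)
print(f"queries T = {oracle.queries}, captured fraction = {frac:.6f}")
```

Here the query count `oracle.queries` plays the role of $T$, and `captured_fraction` evaluates the quantity compared against $1 - \mathrm{const} \times \mathtt{gap}$ in the abstract. The lower bound says that on the hard distribution no method of this query-only form can succeed with substantially fewer than $r \log d / \sqrt{\mathtt{gap}}$ queries.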