Subexponential-Time Algorithms for Sparse PCA
Abstract
We study the computational cost of recovering a unit-norm sparse principal component $x \in \mathbb{R}^n$ planted in a random matrix, in either the Wigner or Wishart spiked model (observing either $W + \lambda xx^\top$ with $W$ drawn from the Gaussian orthogonal ensemble, or $N$ independent samples from $\mathcal{N}(0, I_n + \beta xx^\top)$, respectively). Prior work has shown that when the signal-to-noise ratio ($\lambda$ or $\beta\sqrt{N/n}$, respectively) is a small constant and the fraction of nonzero entries in the planted vector is $\|x\|_0 / n = \rho$, it is possible to recover $x$ in polynomial time if $\rho \lesssim 1/\sqrt{n}$. While it is possible to recover $x$ in exponential time under the weaker condition $\rho \ll 1$, it is believed that polynomial-time recovery is impossible unless $\rho \lesssim 1/\sqrt{n}$. We investigate the precise amount of time required for recovery in the "possible but hard" regime $1/\sqrt{n} \ll \rho \ll 1$ by exploring the power of subexponential-time algorithms, i.e., algorithms running in time $\exp(n^\delta)$ for some constant $\delta \in (0,1)$. For any $1/\sqrt{n} \ll \rho \ll 1$, we give a recovery algorithm with runtime roughly $\exp(\rho^2 n)$, demonstrating a smooth tradeoff between sparsity and runtime. Our family of algorithms interpolates smoothly between two existing algorithms: the polynomial-time diagonal thresholding algorithm and the $\exp(\rho n)$-time exhaustive search algorithm. Furthermore, by analyzing the low-degree likelihood ratio, we give rigorous evidence suggesting that the tradeoff achieved by our algorithms is optimal.
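As a rough illustration of the stated tradeoff, one can parametrize the sparsity as $\rho = n^{-\alpha}$. The exponent $\alpha$ is notation assumed here for illustration, not taken from the abstract, and logarithmic factors hidden by "roughly" are ignored:

```latex
% Assumed parametrization (not from the abstract): \rho = n^{-\alpha}.
% The "possible but hard" regime 1/\sqrt{n} \ll \rho \ll 1 corresponds to 0 < \alpha < 1/2.
\[
  \rho = n^{-\alpha}, \quad 0 < \alpha < \tfrac{1}{2}
  \;\Longrightarrow\;
  \exp(\rho^2 n) = \exp\!\bigl(n^{1 - 2\alpha}\bigr),
  \qquad \text{i.e. } \delta = 1 - 2\alpha \in (0, 1).
\]
% Exhaustive search at the same sparsity is slower, since 1 - \alpha > 1 - 2\alpha:
\[
  \exp(\rho n) = \exp\!\bigl(n^{1 - \alpha}\bigr).
\]
```

Under this parametrization, $\alpha \to 1/2$ gives an exponent $n^{1-2\alpha}$ of order one, approaching the polynomial-time regime of diagonal thresholding, while $\alpha \to 0$ gives $\exp(n)$, where the runtime coincides with that of exhaustive search.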