Optimal Spectral Recovery of a Planted Vector in a Subspace

Abstract

Recovering a planted vector $v$ in an $n$-dimensional random subspace of $\mathbb{R}^N$ is a generic task related to many problems in machine learning and statistics, such as dictionary learning, subspace recovery, principal component analysis, and non-Gaussian component analysis. In this work, we study computationally efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm. For instance, in the special case where $v$ is an $N\rho$-sparse vector with Bernoulli-Gaussian or Bernoulli-Rademacher entries, our results include the following: (1) We give an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers $v$ with high probability in the regime $n\rho \ll \sqrt{N}$. This condition subsumes the conditions $\rho \ll 1/\sqrt{n}$ or $n\sqrt{\rho} \lesssim \sqrt{N}$ required by previous work up to polylogarithmic factors. We achieve $\ell_\infty$ error bounds for the spectral estimator via a leave-one-out analysis, from which it follows that a simple thresholding procedure exactly recovers $v$ with Bernoulli-Rademacher entries, even in the dense case $\rho = 1$. (2) We study the associated detection problem and show that in the regime $n\rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector. This matches the condition for recovery and offers evidence that no polynomial-time algorithm can succeed in recovering a Bernoulli-Gaussian vector $v$ when $n\rho \gg \sqrt{N}$.
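
To make the setting concrete, the following is a minimal NumPy sketch of a spectral estimator in the spirit of Hopkins, Schramm, Shi, and Steurer, followed by thresholding. It is an illustration under stated assumptions, not the exact variant analyzed in the paper. The assumptions are: the input is an N x n matrix B with orthonormal columns spanning the subspace; each entry of the planted vector is nonzero with probability rho and equals plus or minus 1/sqrt(N * rho); the weighted matrix is the sum over rows b_i of (||b_i||^2 - n/N) b_i b_i^T; and the sparse case applies, so the top eigenvector is used. The function names and the parameter values N = 4000, n = 20, rho = 0.05 are hypothetical choices for the demonstration.

```python
import numpy as np


def planted_subspace_basis(N, n, rho, rng):
    """Return (v, B): a Bernoulli-Rademacher vector v and an N x n matrix B
    whose orthonormal columns span v together with n - 1 Gaussian vectors."""
    support = rng.random(N) < rho                  # each entry is nonzero w.p. rho
    signs = rng.choice([-1.0, 1.0], size=N)        # Rademacher signs
    v = support * signs / np.sqrt(N * rho)         # roughly unit l2 norm
    G = rng.standard_normal((N, n - 1)) / np.sqrt(N)
    A = np.column_stack([v, G])
    A = A @ rng.standard_normal((n, n))            # random mixing hides v among the columns
    B, _ = np.linalg.qr(A)                         # reduced QR: orthonormal basis of the span
    return v, B


def spectral_estimate(B):
    """Top eigenvector of sum_i (||b_i||^2 - n/N) b_i b_i^T, mapped back to R^N.
    Assumed form of the statistic; intended for the sparse case only."""
    N, n = B.shape
    weights = np.sum(B**2, axis=1) - n / N         # centered squared row norms
    M = (B * weights[:, None]).T @ B               # n x n symmetric matrix
    _, eigvecs = np.linalg.eigh(M)                 # eigenvalues in ascending order
    u = eigvecs[:, -1]                             # eigenvector of the largest eigenvalue
    return B @ u                                   # unit vector in the subspace


def threshold_round(v_hat, N, rho):
    """Round each entry to 0 or +-1/sqrt(N * rho) using a half-scale threshold."""
    scale = 1.0 / np.sqrt(N * rho)
    return np.where(np.abs(v_hat) > scale / 2, np.sign(v_hat) * scale, 0.0)


rng = np.random.default_rng(0)
N, n, rho = 4000, 20, 0.05                         # here n * rho = 1, well below sqrt(N)
v, B = planted_subspace_basis(N, n, rho, rng)
v_hat = spectral_estimate(B)

# The estimator is only defined up to a global sign.
corr = abs(v_hat @ v) / np.linalg.norm(v)
v_round = threshold_round(v_hat, N, rho)
exact = np.allclose(v_round, v) or np.allclose(v_round, -v)
print(f"correlation = {corr:.3f}, exact recovery up to sign = {exact}")
```

The sketch takes the top eigenvector because a sparse planted vector has a larger $\ell_4$ norm than a Gaussian vector of the same $\ell_2$ norm. For a dense vector whose $\ell_4$ norm falls below the Gaussian value, a different choice of eigenvector would presumably be needed, which this sketch does not attempt.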
