Robust Sparse Mean Estimation via Sum of Squares

Abstract

We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample-efficient and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on $\mathbb{R}^d$ with "certifiably bounded" $t$-th moments and sufficiently light tails, our algorithm achieves error of $O(\epsilon^{1-1/t})$ with sample complexity $m = (k \log(d))^{O(t)}/\epsilon^{2-2/t}$. For the special case of the Gaussian distribution, our algorithm achieves near-optimal error of $O(\epsilon)$ with sample complexity $m = O(k^4 \,\mathrm{polylog}(d))/\epsilon^2$. Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach. We complement our upper bounds with Statistical Query and low-degree polynomial testing lower bounds, providing evidence that the sample-time-error tradeoffs achieved by our algorithms are qualitatively the best possible.
