List-Decodable Sparse Mean Estimation via Difference-of-Pairs Filtering

Abstract

We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $\alpha \in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\alpha m$ of which are i.i.d. samples from a distribution $D$ with unknown $k$-sparse mean $\mu$. No assumptions are made on the remaining points, which form the majority of the dataset. The goal is to return a small list of candidates containing a vector $\hat{\mu}$ such that $\| \hat{\mu} - \mu \|_2$ is small. Prior work had studied the problem of list-decodable mean estimation in the dense setting. In this work, we develop a novel, conceptually simpler technique for list-decodable mean estimation. As the main application of our approach, we provide the first sample- and computationally-efficient algorithm for list-decodable sparse mean estimation. In particular, for distributions with "certifiably bounded" $t$-th moments in $k$-sparse directions and sufficiently light tails, our algorithm achieves error of $(1/\alpha)^{O(1/t)}$ with sample complexity $m = (k \log(n))^{O(t)}/\alpha$ and running time $\mathrm{poly}(m n^t)$. For the special case of Gaussian inliers, our algorithm achieves the optimal error guarantee of $\Theta(\sqrt{\log(1/\alpha)})$ with quasi-polynomial sample and computational complexity. We complement our upper bounds with nearly-matching statistical query and low-degree polynomial testing lower bounds.
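
To make the input model concrete, the following is a minimal sketch of a toy instance; it is an illustration only and not the paper's algorithm. The function names, the coordinate value 5.0, the unit-variance Gaussian inliers, and the single decoy cluster around $-\mu$ are all assumptions chosen for illustration, since the setting allows the non-inlier points to be arbitrary. The sketch shows why a single estimate such as the empirical mean is uninformative when inliers are a minority.

```python
import math
import random


def make_instance(m, n, k, alpha, seed=0):
    """Toy instance: floor(alpha * m) Gaussian inliers with a k-sparse mean,
    plus outliers forming one decoy cluster (an illustrative assumption)."""
    rng = random.Random(seed)
    # k-sparse true mean: k random coordinates set to 5.0 (arbitrary value).
    mu = [0.0] * n
    for i in rng.sample(range(n), k):
        mu[i] = 5.0
    num_inliers = math.floor(alpha * m)
    inliers = [[mu[j] + rng.gauss(0.0, 1.0) for j in range(n)]
               for _ in range(num_inliers)]
    # Outliers mimic a competing cluster around -mu; a real adversary
    # is unconstrained and could place these points anywhere.
    outliers = [[-mu[j] + rng.gauss(0.0, 1.0) for j in range(n)]
                for _ in range(m - num_inliers)]
    points = inliers + outliers
    rng.shuffle(points)
    return points, mu


def empirical_mean(points):
    """Coordinate-wise average of all points."""
    count = len(points)
    return [sum(column) / count for column in zip(*points)]


def l2_distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


points, mu = make_instance(m=400, n=20, k=3, alpha=0.25)
naive_error = l2_distance(empirical_mean(points), mu)
print(f"l2 error of the naive empirical mean: {naive_error:.2f}")
```

In this toy instance the decoy cluster outnumbers the inliers, so the empirical mean is pulled toward $-\mu$ and lands far from $\mu$. This is the motivation for returning a short list of candidates rather than one estimate.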
