List Decodable Learning via Sum of Squares
Abstract
In the list-decodable learning setup, an overwhelming majority (say, a $(1-\beta)$-fraction) of the input data consists of outliers, and the goal of an algorithm is to output a small list $\mathcal{L}$ of hypotheses such that one of them agrees with the inliers. We develop a framework for list-decodable learning via the Sum-of-Squares SDP hierarchy and demonstrate it on two basic statistical estimation problems.

Linear regression: Suppose we are given labelled examples $\{(X_i, y_i)\}_{i \in [N]}$ containing a subset $S$ of $\beta N$ inliers $\{X_i\}_{i \in S}$ that are drawn i.i.d. from the standard Gaussian distribution $N(0, I)$ in $\mathbb{R}^d$, where the corresponding labels $y_i$ are well-approximated by a linear function $\ell$. We devise an algorithm that outputs a list $\mathcal{L}$ of linear functions such that there exists some $\hat{\ell} \in \mathcal{L}$ that is close to $\ell$. This yields the first algorithm for linear regression in a list-decodable setting. Our results hold for any distribution of examples whose concentration and anticoncentration can be certified by Sum-of-Squares proofs.

Mean estimation: Given data points $\{X_i\}_{i \in [N]}$ containing a subset $S$ of $\beta N$ inliers $\{X_i\}_{i \in S}$ that are drawn i.i.d. from a Gaussian distribution $N(\mu, I)$ in $\mathbb{R}^d$, we devise an algorithm that generates a list $\mathcal{L}$ of means such that there exists $\hat{\mu} \in \mathcal{L}$ close to $\mu$. The recovery guarantees of the algorithm are analogous to those of the existing algorithms for the problem by Diakonikolas et al. and Kothari et al.

In an independent and concurrent work, Karmalkar et al. [KlivansKS19] also obtain an algorithm for list-decodable linear regression using the Sum-of-Squares SDP hierarchy.
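To make the mean-estimation setup concrete, the following toy sketch builds an instance in which only a $\beta$-fraction of the points are inliers and shows why a single estimate cannot suffice while a short list can. It is not the Sum-of-Squares algorithm of the paper. The function name `make_instance`, the choice of Gaussian decoy clusters for the outliers, and all parameter values are assumptions made for illustration; in the actual model the outliers are arbitrary. The candidate list is computed from the planted labels, which a real algorithm never sees.

```python
import numpy as np

def make_instance(n=2000, d=5, beta=0.25, seed=0):
    """Toy list-decodable mean-estimation instance (illustrative only)."""
    rng = np.random.default_rng(seed)
    k = int(round(1 / beta))          # one inlier cluster plus (k - 1) decoy clusters
    assert k <= d, "this toy construction needs d >= 1/beta"
    size = int(round(beta * n))       # beta * n points per cluster
    centers = 10.0 * np.eye(k, d)     # well-separated means; row 0 plays the true mu
    labels = np.repeat(np.arange(k), size)
    # Every cluster is N(center, I), so decoys are indistinguishable from inliers.
    X = centers[labels] + rng.standard_normal((k * size, d))
    perm = rng.permutation(k * size)
    return X[perm], labels[perm], centers[0]

X, labels, mu = make_instance()

# A single estimate such as the overall empirical mean is pulled far from mu.
naive_error = np.linalg.norm(X.mean(axis=0) - mu)

# A list of 1/beta candidates (here: per-cluster means, using the hidden labels
# purely for illustration) contains one hypothesis close to mu.
candidates = [X[labels == j].mean(axis=0) for j in np.unique(labels)]
best_error = min(np.linalg.norm(c - mu) for c in candidates)

print(f"naive mean error: {naive_error:.2f}")
print(f"best error in list of size {len(candidates)}: {best_error:.2f}")
```

Because each decoy cluster has the same size and shape as the inlier cluster, no procedure can tell from the data alone which cluster is the true one, which is why the output in this setting is a list rather than a single hypothesis.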