Complexity Theoretic Limitations on Learning Halfspaces
Abstract
We study the problem of agnostically learning halfspaces, which is defined by a fixed but unknown distribution $\mathcal{D}$ on $\mathbb{Q}^n \times \{\pm 1\}$. We define $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D})$ as the least error of a halfspace classifier for $\mathcal{D}$. A learner who can access $\mathcal{D}$ has to return a hypothesis whose error is small compared to $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D})$. Using the recently developed method of the author, Linial and Shalev-Shwartz, we prove hardness of learning results under a natural assumption on the complexity of refuting random $K$-XOR formulas. We show that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D}) \le \eta$ for an arbitrarily small constant $\eta > 0$, and that $\mathcal{D}$ is supported in $\{\pm 1\}^n \times \{\pm 1\}$. Namely, even under these favorable conditions its error must be $\ge \frac{1}{2} - \frac{1}{n^c}$ for every $c > 0$. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where $K$ can be poly-logarithmic in $n$), we can take $\eta = 2^{-\log^{1-\nu}(n)}$ for arbitrarily small $\nu > 0$. Interestingly, this is even stronger than the best known lower bounds (Arora et al. 1993, Feldman et al. 2006, Guruswami and Raghavendra 2006) for the case that the learner is restricted to return a halfspace classifier (i.e. proper learning).
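For readers who want the quantities in the abstract spelled out, the following is one standard way to write them. The symbols $h_{w,b}$, $w$, $b$ and $\mathrm{Err}_{\mathcal{D}}$ are assumed notation, not taken from the abstract, and the infimum is used because a minimizing halfspace need not exist for every distribution.

```latex
% Assumed notation: a halfspace with weight vector w and threshold b.
h_{w,b}(x) = \operatorname{sign}\bigl(\langle w, x\rangle - b\bigr),
\qquad w \in \mathbb{Q}^n,\ b \in \mathbb{Q}

% Error of a hypothesis h with respect to the distribution D.
\mathrm{Err}_{\mathcal{D}}(h) = \Pr_{(x,y)\sim\mathcal{D}}\bigl[h(x) \neq y\bigr]

% Least error achievable by any halfspace.
\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D}) = \inf_{w,\,b}\ \mathrm{Err}_{\mathcal{D}}(h_{w,b})
```

In this notation, a hypothesis that ignores the input and guesses a uniformly random label has error $\frac{1}{2}$, so a lower bound of $\frac{1}{2} - \frac{1}{n^c}$ for every $c > 0$ says that an efficient learner cannot beat random guessing by any inverse-polynomial margin in the worst case.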