Provable Approximations for Constrained ℓ_p Regression

Abstract

The ℓ_p linear regression problem is to minimize f(x) = ||Ax-b||_p over x ∈ R^d, where A ∈ R^{n×d}, b ∈ R^n, and p > 0. To avoid overfitting and to bound ||x||_2, the constrained ℓ_p regression minimizes f(x) over all unit vectors x ∈ R^d. This makes the problem non-convex even for the simplest case d = p = 2. Instead, ridge regression is used to minimize the Lagrange form f(x) + λ||x||_2 over x ∈ R^d, which yields a convex problem at the price of calibrating the regularization parameter λ > 0. We provide the first provable constant-factor approximation algorithm that solves the constrained ℓ_p regression directly, for all constants p, d ≥ 1. Using core-sets, its running time is O(n log n), including extensions for streaming and distributed (big) data. In polynomial time, it can also handle outliers and p ∈ (0,1), and minimize f(x) over every x and every permutation of the rows of A. Experimental results are also provided, including open-source code and a comparison to existing software.
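
To make the constrained objective concrete, the following minimal sketch evaluates f(x) = ||Ax-b||_p on a fine grid of unit vectors for d = 2 and keeps the best one. It is a brute-force illustration of the problem statement only, not the paper's approximation algorithm or its core-set construction. The function names, the grid resolution, and the random data are assumptions introduced here.

```python
import numpy as np

def lp_cost(A, b, x, p):
    """f(x) = ||Ax - b||_p for p > 0 (a true norm only when p >= 1)."""
    residual = np.abs(A @ x - b)
    return float(np.sum(residual ** p) ** (1.0 / p))

def constrained_lp_grid(A, b, p, num_angles=3600):
    """Brute-force search over unit vectors in R^2 (hypothetical helper).

    Illustrates the constrained objective; it carries no approximation
    guarantee beyond the resolution of the angle grid.
    """
    thetas = np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False)
    # Each row is a candidate unit vector (cos t, sin t).
    candidates = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
    costs = np.array([lp_cost(A, b, x, p) for x in candidates])
    best = int(np.argmin(costs))
    return candidates[best], float(costs[best])

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))
b = rng.normal(size=50)

x_hat, cost = constrained_lp_grid(A, b, p=2)
print("unit-norm minimizer on the grid:", x_hat, "cost:", cost)
```

The feasible set here is the unit circle rather than all of R^2, which is why the problem is non-convex even for d = p = 2: the midpoint of two distinct unit vectors has norm below 1 and is therefore infeasible.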
