Fast Regression with an ∞ Guarantee

Abstract

Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an n × d matrix A, with n d, as well as an n × 1 vector b, and one wants to find a vector x so as to minimize the residual error \|Ax-b\|2. Using the sketch and solve paradigm, one first computes S · A and S · b for a randomly chosen matrix S, then outputs x' = (SA) Sb so as to minimize \|SAx' - Sb\|2. The sketch-and-solve paradigm gives a bound on \|x'-x*\|2 when A is well-conditioned. Our main result is that, when S is the subsampled randomized Fourier/Hadamard transform, the error x' - x* behaves as if it lies in a "random" direction within this bound: for any fixed direction a∈ Rd, we have with 1 - d-c probability that \[ a, x'-x* \|a\|2\|x'-x*\|2d12-γ, (1) \] where c, γ > 0 are arbitrary constants. This implies \|x'-x*\|∞ is a factor d12-γ smaller than \|x'-x*\|2. It also gives a better bound on the generalization of x' to new examples: if rows of A correspond to examples and columns to features, then our result gives a better bound for the error introduced by sketch-and-solve when classifying fresh examples. We show that not all oblivious subspace embeddings S satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not satisfy these properties. We also provide lower bounds, both on how small \|x'-x*\|2 can be, and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…