Low Rank Approximation and Regression in Input Sparsity Time
Abstract
We design a new distribution over $\mathrm{poly}(r \varepsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a subspace embedding. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d \varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ sparse embedding matrices. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\varepsilon)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading order term in the time complexity of our algorithms is $O(\mathrm{nnz}(A))$ or $O(\mathrm{nnz}(A) \log n)$. We optimize the low-order $\mathrm{poly}(d/\varepsilon)$ terms in our running times (or for rank-$k$ approximation, the $n \cdot \mathrm{poly}(k/\varepsilon)$ term), and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\varepsilon$ in least squares regression to $\log 1/\varepsilon$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.
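To make the idea of a sparse embedding concrete, here is a minimal Python sketch. It assumes a CountSketch-style construction in which each column of $S$ holds a single random $\pm 1$ entry in a uniformly random row; the abstract itself does not spell out the construction. The function names `sparse_embed` and `sketched_least_squares`, the sketch size `m`, and the test data are illustrative choices, not taken from the paper. Applying $S$ touches each entry of $A$ once, which is where the $\mathrm{nnz}(A)$ term would come from; this sketch uses a dense NumPy array for simplicity, so it does not exploit sparsity.

```python
import numpy as np


def sparse_embed(A, m, rng):
    """Return S @ A for a random m x n sparse embedding S, without forming S.

    Assumed construction (CountSketch-style): column i of S has exactly one
    non-zero entry, a random sign, placed in a uniformly random row.
    """
    n = A.shape[0]
    rows = rng.integers(0, m, size=n)        # target row (bucket) for each row of A
    signs = rng.choice([-1.0, 1.0], size=n)  # random sign for each row of A
    SA = np.zeros((m, A.shape[1]))
    # SA[rows[i]] += signs[i] * A[i]; np.add.at accumulates repeated indices.
    np.add.at(SA, rows, signs[:, None] * A)
    return SA


def sketched_least_squares(A, b, m, rng):
    """Approximately solve min_x ||Ax - b||_2 by solving the sketched problem."""
    # Sketch A and b with the same S by embedding the augmented matrix [A b].
    S_Ab = sparse_embed(np.column_stack([A, b]), m, rng)
    x, *_ = np.linalg.lstsq(S_Ab[:, :-1], S_Ab[:, -1], rcond=None)
    return x


# Illustrative data: a tall, overconstrained regression problem.
rng = np.random.default_rng(0)
n, d = 20000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_sketch = sketched_least_squares(A, b, m=2000, rng=rng)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Ratio of residual norms; it is at least 1 because x_exact is optimal.
ratio = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b)
print("residual ratio (sketched / exact):", ratio)
```

The sketched problem has only `m` rows instead of `n`, so the dense solve is much cheaper. The value `m=2000` here is an arbitrary choice for the demonstration, not the $\mathrm{poly}(d/\varepsilon)$ bound from the paper.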