Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression
Abstract
We improve upon previous oblivious sketching and turnstile streaming results for $\ell_1$ and logistic regression, giving a much smaller sketching dimension achieving $O(1)$-approximation and yielding an efficient optimization problem in the sketch space. Namely, we achieve for any constant $c>0$ a sketching dimension of $O(d^{1+c})$ for $\ell_1$ regression and $O(\mu d^{1+c})$ for logistic regression, where $\mu$ is a standard measure that captures the complexity of compressing the data. For $\ell_1$ regression our sketching dimension is near-linear and improves on previous work, which either required an $\Omega(\log d)$-approximation with this sketching dimension or required a larger $\mathrm{poly}(d)$ number of rows. Similarly, for logistic regression previous work had worse $\mathrm{poly}(\mu d)$ factors in its sketching dimension. We also give a tradeoff that yields a $(1+\varepsilon)$-approximation in input sparsity time by increasing the total size to $(d \log(n)/\varepsilon)^{O(1/\varepsilon)}$ for $\ell_1$ and to $(\mu d \log(n)/\varepsilon)^{O(1/\varepsilon)}$ for logistic regression. Finally, we show that our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.
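For orientation, the two objectives and the kind of guarantee the abstract refers to can be written out as follows. This is a sketch in notation that is assumed here rather than taken from the abstract: $A \in \mathbb{R}^{n \times d}$ is the data matrix, $b$ the target vector, $Z$ the logistic-regression data with rows $z_i = -y_i x_i$ (labels folded into the features), $S$ the sketching matrix, and $C$ the constant approximation factor. The definition of $\mu$ shown is the one commonly used in earlier work on sketching logistic regression; the abstract itself only calls it "a standard measure".

```latex
% l_1 regression (assumed notation: A is n x d, b has n entries)
\min_{x \in \mathbb{R}^d} \; \|Ax - b\|_1

% Logistic regression with labels folded into the rows z_i = -y_i x_i
\min_{\beta \in \mathbb{R}^d} \; f(Z\beta)
  = \sum_{i=1}^{n} \ln\!\bigl(1 + \exp(z_i \beta)\bigr)

% Complexity parameter mu (assumed: the usual definition from prior work),
% where (v)^+ and (v)^- collect the positive and negative entries of v
\mu(Z) = \sup_{\beta \neq 0}
  \frac{\|(Z\beta)^{+}\|_1}{\|(Z\beta)^{-}\|_1}

% O(1)-approximation via the sketch: a minimizer \tilde{x} of the sketched
% problem is evaluated on the ORIGINAL objective
\tilde{x} \in \operatorname*{arg\,min}_{x} \|S(Ax - b)\|_1
\quad\Longrightarrow\quad
\|A\tilde{x} - b\|_1 \le C \cdot \min_{x} \|Ax - b\|_1
```

Here the sketching dimension is the number of rows of $S$. The last line shows the simplest plain sketch-and-solve form of the guarantee; the paper's actual estimator in the sketch space may differ, and the abstract only states that the sketched problem can be optimized efficiently.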