Variance Reduction for Matrix Games
Abstract
We present a randomized primal-dual algorithm that solves the problem $\min_{x} \max_{y} y^\top A x$ to additive error $\epsilon$ in time $\mathrm{nnz}(A) + \sqrt{\mathrm{nnz}(A)\, n}/\epsilon$, for a matrix $A$ with larger dimension $n$ and $\mathrm{nnz}(A)$ nonzero entries. This improves on the best known exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/n}$ and is faster than fully stochastic gradient methods in the accurate and/or sparse regime $\epsilon \le \sqrt{n/\mathrm{nnz}(A)}$. Our results hold for $x, y$ in the simplex (matrix games, linear programming) and for $x$ in an $\ell_2$ ball and $y$ in the simplex (perceptron / SVM, minimum enclosing ball). Our algorithm combines Nemirovski's "conceptual prox-method" and a novel reduced-variance gradient estimator based on "sampling from the difference" between the current iterate and a reference point.
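To make the "sampling from the difference" idea concrete, the following is a minimal sketch of one plausible reading for the simplex case, not the paper's exact procedure. It estimates $A^\top y$ by starting from the exact value $A^\top y_0$ at a reference point $y_0$ and correcting it with a single row of $A$, drawn with probability proportional to $|y_i - y_{0,i}|$. The function names, the dense list-of-rows matrix format, and this particular sampling distribution are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def matvec_T(A, y):
    """Return A^T y for a dense matrix A given as a list of rows."""
    n_cols = len(A[0])
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(n_cols)]


def sample_from_difference(A, y, y0, ATy0, rng):
    """One-sample unbiased estimate of A^T y, centered at the reference y0.

    ASSUMPTION (illustrative, not taken from the abstract): row i is drawn
    with probability p_i = |y_i - y0_i| / ||y - y0||_1, and the estimate is
    A^T y0 + A[i] * (y_i - y0_i) / p_i.
    ATy0 is the precomputed exact product A^T y0.
    """
    diff = [yi - y0i for yi, y0i in zip(y, y0)]
    weights = [abs(d) for d in diff]
    l1 = sum(weights)
    if l1 == 0.0:
        # y coincides with the reference point, so the estimate is exact.
        return list(ATy0)
    i = rng.choices(range(len(y)), weights=weights, k=1)[0]
    scale = diff[i] / (weights[i] / l1)  # equals sign(diff[i]) * l1
    return [g0 + scale * a for g0, a in zip(ATy0, A[i])]
```

Under these assumptions the correction term has magnitude proportional to $\|y - y_0\|_1$, so the estimate becomes more accurate as the iterate approaches the reference point, which is the sense in which the variance is "reduced". A hypothetical usage: compute `ATy0 = matvec_T(A, y0)` once per reference point, then call `sample_from_difference(A, y, y0, ATy0, random.Random(0))` at each inner step, touching only one row of `A` per call.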