Solving Empirical Risk Minimization in the Current Matrix Multiplication Time

Abstract

Many convex problems in machine learning and computer science share the same form:
\begin{align*}
\min_{x} \sum_{i} f_i(A_i x + b_i),
\end{align*}
where $f_i$ are convex functions on $\mathbb{R}^{n_i}$ with constant $n_i$, $A_i \in \mathbb{R}^{n_i \times d}$, $b_i \in \mathbb{R}^{n_i}$, and $\sum_i n_i = n$. This problem generalizes linear programming and includes many problems in empirical risk minimization. In this paper, we give an algorithm that runs in time
\begin{align*}
O^*\left( \left( n^{\omega} + n^{2.5 - \alpha/2} + n^{2 + 1/6} \right) \log(n / \delta) \right),
\end{align*}
where $\omega$ is the exponent of matrix multiplication, $\alpha$ is the dual exponent of matrix multiplication, and $\delta$ is the relative accuracy. Note that the runtime has only a logarithmic dependence on the condition numbers and other data-dependent parameters, and these are captured in $\delta$. For the current bounds $\omega \approx 2.38$ [Vassilevska Williams'12, Le Gall'14] and $\alpha \approx 0.31$ [Le Gall, Urrutia'18], our runtime $O^*(n^{\omega} \log(n / \delta))$ matches the current best for solving a dense least squares regression problem, a special case of the problem we consider. Very recently, [Alman'18] proved that all currently known techniques cannot give a bound on $\omega$ below $2.168$, which is larger than our $2 + 1/6$. Our result generalizes the very recent result on solving linear programs in the current matrix multiplication time [Cohen, Lee, Song'19] to a broader class of problems. Our algorithm introduces two concepts that differ from [Cohen, Lee, Song'19]: (1) We give a robust deterministic central path method, whereas the previous one is a stochastic central path method that updates weights by a random sparse vector. (2) We propose an efficient data structure to maintain the central path of interior point methods even when the weight update vector is dense.
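To make the problem form and the runtime claim concrete, the following is a small illustrative sketch in Python with NumPy. It is not taken from the paper and does not implement the paper's algorithm. It assumes the simplest instance, least squares with $n_i = 1$, $f_i(z) = z^2$, $A_i$ equal to the $i$-th row of a data matrix and $b_i = -y_i$, and checks that the block-wise objective agrees with the usual squared residual norm. It also compares the three exponents in the runtime bound for the quoted values $\omega \approx 2.38$ and $\alpha \approx 0.31$. The helper names `erm_objective`, `least_squares_blocks`, and `runtime_exponents` are hypothetical and introduced only for this illustration.

```python
import numpy as np


def erm_objective(x, blocks):
    """Evaluate sum_i f_i(A_i x + b_i) for blocks given as (f_i, A_i, b_i)."""
    return sum(f(A @ x + b) for f, A, b in blocks)


def least_squares_blocks(M, y):
    """Write ||M x - y||^2 in ERM form (illustrative assumption):
    n_i = 1, A_i = i-th row of M, b_i = -y_i, f_i(z) = z^2."""
    return [
        (lambda z: float(z[0] ** 2), M[i:i + 1, :], -y[i:i + 1])
        for i in range(M.shape[0])
    ]


def runtime_exponents(omega, alpha):
    """Exponents of n in the three terms of the runtime bound."""
    return {
        "n^omega": omega,
        "n^(2.5 - alpha/2)": 2.5 - alpha / 2,
        "n^(2 + 1/6)": 2 + 1 / 6,
    }


# Small random instance: n = 6 one-dimensional blocks, d = 3 variables.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 3))
y = rng.standard_normal(6)
x = rng.standard_normal(3)

blocks = least_squares_blocks(M, y)
value = erm_objective(x, blocks)           # block-wise ERM objective
direct = float(np.sum((M @ x - y) ** 2))   # usual least squares objective

# With the values quoted in the abstract, find the largest exponent.
exps = runtime_exponents(omega=2.38, alpha=0.31)
dominant = max(exps, key=exps.get)

print(value, direct)
print(exps, dominant)
```

With these values the second exponent is $2.5 - 0.31/2 = 2.345$ and the third is $2 + 1/6 \approx 2.167$, both below $2.38$, which is why the bound simplifies to $O^*(n^{\omega} \log(n / \delta))$.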
