Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

Abstract

Given a nonconvex function that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The convergence of our new methods depends on the smallest (negative) eigenvalue -σ of the Hessian, a parameter that describes how nonconvex the function is. Our methods outperform known results for a range of values of σ, and can be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ_0 such that the currently fastest methods for σ > σ_0 and for σ < σ_0 behave differently: the former scale with n^{2/3} and the latter scale with n^{3/4}.
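
The setting described above can be sketched in symbols as follows. This is only an illustrative formalization: the abstract names n and σ, while x, d, f_i, and the tolerance ε are notation assumed here.

```latex
% Finite-sum objective: an average of n smooth functions
% (x, d, and f_i are assumed notation, not taken from the abstract)
f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x), \qquad x \in \mathbb{R}^d

% Nonconvexity parameter: the Hessian's eigenvalues are bounded below by -\sigma, with \sigma \ge 0
\lambda_{\min}\!\left(\nabla^2 f(x)\right) \ge -\sigma \quad \text{for all } x

% Approximate stationary point, for an assumed tolerance \varepsilon > 0
\|\nabla f(x)\| \le \varepsilon
```

Under this reading, σ = 0 corresponds to a convex function, and a larger σ permits more negative curvature.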
