Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
Abstract
Given a nonconvex function that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The convergence of our new methods depends on the smallest (negative) eigenvalue -σ of the Hessian, a parameter that describes how nonconvex the function is. Our methods outperform known results for a range of values of the parameter σ, and can be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ_0 such that the currently fastest methods for σ > σ_0 and for σ < σ_0 behave differently, the former scaling with n^{2/3} and the latter with n^{3/4}.
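As a rough sketch of the setting the abstract describes, the problem and the role of σ can be written out as follows. The symbols f_i, x, and ε are illustrative names not used in the abstract, and the Euclidean norm and the "for all x" quantifier are assumptions about the standard formulation rather than statements taken from the text.

```latex
% Finite-sum objective: an average of n smooth functions
f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% Bounded nonconvexity: no Hessian eigenvalue is below -\sigma (with \sigma \ge 0)
\nabla^2 f(x) \succeq -\sigma I \quad \text{for all } x

% Goal: an \varepsilon-approximate stationary point
\|\nabla f(x)\| \le \varepsilon
```

Under this reading, σ = 0 corresponds to a convex f, and larger σ allows more negative curvature.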