Escape saddle points by a simple gradient-descent based algorithm

Tongyang Li

Abstract

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function $f\colon \mathbb{R}^n \to \mathbb{R}$, it outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with $\tilde{O}((\log n)^{4}/\epsilon^{2})$ or $\tilde{O}((\log n)^{6}/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$. For the stochastic setting, our algorithm outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}((\log n)^{2}/\epsilon^{4})$ iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve the polynomial speedup in $\log n$ compared to the perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
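
To make the idea of a Hessian power method that uses only gradients more concrete, here is a minimal sketch. It is not the paper's algorithm, whose details are not given in this abstract. It only illustrates the general principle: a Hessian-vector product can be approximated by a difference of two gradients, and power iteration on $I - \eta H$ then amplifies the direction of most negative curvature. The toy function, the step size `eta`, the finite-difference radius `r`, and all function names are assumptions chosen for illustration.

```python
import math
import random

def grad(x):
    # Gradient of the toy function f(x) = 0.5 * (x[0]**2 - x[1]**2),
    # an assumed example with a strict saddle point at the origin.
    return [x[0], -x[1]]

def hessian_vector_product(grad_fn, x, v, r=1e-4):
    # Gradient-only approximation: H(x) v ~ (grad(x + r v) - grad(x)) / r.
    shifted = [xi + r * vi for xi, vi in zip(x, v)]
    g1, g0 = grad_fn(shifted), grad_fn(x)
    return [(a - b) / r for a, b in zip(g1, g0)]

def negative_curvature_direction(grad_fn, x, eta=0.5, iters=50, seed=0):
    # Power iteration on (I - eta * H): eigenvectors of H with negative
    # eigenvalues are scaled by a factor above 1, so they dominate.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x]
    for _ in range(iters):
        hv = hessian_vector_product(grad_fn, x, v)
        v = [vi - eta * hvi for vi, hvi in zip(v, hv)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
    # Rayleigh quotient v^T H v estimates the curvature along v.
    hv = hessian_vector_product(grad_fn, x, v)
    curvature = sum(vi * hvi for vi, hvi in zip(v, hv))
    return v, curvature

direction, curvature = negative_curvature_direction(grad, [0.0, 0.0])
print(direction, curvature)
```

On this toy saddle the returned direction aligns with the second coordinate axis, along which the function curves downward, and the estimated curvature is close to the Hessian's negative eigenvalue. A step along that direction would decrease the function value, which is how negative curvature helps an iterate leave a saddle point.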
