Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step

Abstract

We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the global minimum to within $\epsilon$ accuracy in $O(\epsilon^{-1} \log(1/\epsilon))$ steps for large $\epsilon$ and $O(\log(1/\epsilon))$ steps for small $\epsilon$ (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.
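As a rough illustration of the setting, the sketch below runs plain gradient descent on a small cubic-regularized quadratic, f(x) = (1/2) x^T A x + b^T x + (rho/3) ||x||^3, and compares the norm of the result with the norm of the global minimizer. Everything specific here is an assumption made for the example rather than taken from the paper: A is diagonal (that is, we work in its eigenbasis) with one negative eigenvalue, the values of `eigs`, `b`, `rho`, the step size `eta`, and the iteration count are picked by hand, and the iterate starts at the origin. The reference value uses the standard characterization of the global minimizer s, namely (A + rho ||s|| I) s = -b with A + rho ||s|| I positive semidefinite, solved for ||s|| by bisection.

```python
import math

def grad(eigs, b, rho, x):
    """Gradient of f(x) = 0.5*sum(a_i*x_i^2) + sum(b_i*x_i) + (rho/3)*||x||^3."""
    norm = math.sqrt(sum(v * v for v in x))
    return [a * v + bi + rho * norm * v for a, bi, v in zip(eigs, b, x)]

def gradient_descent(eigs, b, rho, eta, steps):
    """Fixed-step gradient descent started at the origin (an assumption of this sketch)."""
    x = [0.0] * len(b)
    for _ in range(steps):
        g = grad(eigs, b, rho, x)
        x = [v - eta * gi for v, gi in zip(x, g)]
    return x

def global_min_norm(eigs, b, rho, iters=200):
    """Bisection for r = ||s|| with sum_i b_i^2 / (a_i + rho*r)^2 = r^2,
    restricted to r > max(0, -min(eigs)/rho) so that A + rho*r*I is positive definite."""
    def phi(r):
        return sum((bi / (a + rho * r)) ** 2 for a, bi in zip(eigs, b)) - r * r
    lo = max(0.0, -min(eigs) / rho)
    hi = lo + 1.0
    while phi(hi) > 0:          # grow the bracket until phi changes sign
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical problem data: one negative eigenvalue makes f non-convex,
# and b has a nonzero component along the corresponding eigenvector.
eigs = [-1.0, 0.5, 2.0]
b = [0.1, -0.3, 0.2]
rho = 1.0

x_gd = gradient_descent(eigs, b, rho, eta=0.05, steps=2000)
r_gd = math.sqrt(sum(v * v for v in x_gd))
r_star = global_min_norm(eigs, b, rho)
print("||x_gd|| =", r_gd, " ||s|| =", r_star)
```

In this toy instance the iterate keeps a negative component along the negative-curvature direction and settles at the stationary point whose norm matches the bisection value, rather than at one of the other stationary points. The sketch does not reproduce the paper's step-size conditions or iteration bounds.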
