On the Global Convergence of Continuous-Time Stochastic Heavy-Ball Method for Nonconvex Optimization
Abstract
We study the convergence behavior of the stochastic heavy-ball method with a small stepsize. Under a change of time scale, we approximate the discrete method by a stochastic differential equation that models small random perturbations of a coupled system of nonlinear oscillators. We rigorously show that the perturbed system converges to a local minimum in logarithmic time. This indicates that the diffusion process approximating the stochastic heavy-ball method needs, up to a logarithmic factor, only a time linear in the square root of the inverse stepsize to escape from all saddle points. This result may suggest fast convergence of its discrete-time counterpart. Our theoretical results are validated by numerical experiments.
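
For readers who want a concrete picture of the discrete method the abstract refers to, here is a minimal sketch of a stochastic heavy-ball iteration in the standard Polyak form, x_{k+1} = x_k - step * (grad f(x_k) + noise) + momentum * (x_k - x_{k-1}). It is an illustration only, not the paper's experiment: the toy objective f(x, y) = x^2/2 + y^4/4 - y^2/2 (a saddle at the origin, minima at (0, ±1)), the Gaussian gradient noise, and all parameter values and function names are assumptions chosen for this example.

```python
import numpy as np


def grad(x):
    # Gradient of the assumed toy objective
    # f(x, y) = 0.5 * x**2 + 0.25 * y**4 - 0.5 * y**2,
    # which has a saddle point at (0, 0) and local minima at (0, +1), (0, -1).
    return np.array([x[0], x[1] ** 3 - x[1]])


def stochastic_heavy_ball(x0, step=0.01, momentum=0.9, noise=0.1,
                          iters=5000, seed=0):
    # Hypothetical helper: heavy-ball updates with additive Gaussian
    # noise on the gradient (parameter values are illustrative only).
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    x_prev = x.copy()  # zero initial velocity
    for _ in range(iters):
        noisy_grad = grad(x) + noise * rng.standard_normal(2)
        x_next = x - step * noisy_grad + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    # Start exactly at the saddle point; only the noise can push the
    # iterate off it, after which it should settle near one of the minima.
    final = stochastic_heavy_ball([0.0, 0.0])
    print("final iterate:", final)
```

Starting the iteration exactly at the saddle makes the role of the noise visible: without it the gradient is zero and the iterate never moves, while with it the iterate leaves along the unstable y direction and fluctuates around one of the two minima.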