On Accelerated Methods in Optimization

Abstract

In convex optimization, there is an acceleration phenomenon in which we can boost the convergence rate of certain gradient-based algorithms. We can observe this phenomenon in Nesterov's accelerated gradient descent, accelerated mirror descent, and accelerated cubic-regularized Newton's method, among others. In this paper, we show that the family of higher-order gradient methods in discrete time (generalizing gradient descent) corresponds to a family of first-order rescaled gradient flows in continuous time. On the other hand, the family of accelerated higher-order gradient methods (generalizing accelerated mirror descent) corresponds to a family of second-order differential equations in continuous time, each of which is the Euler-Lagrange equation of a family of Lagrangian functionals. We also study the exponential variant of the Nesterov Lagrangian, which corresponds to a generalization of Nesterov's restart scheme and achieves a linear rate of convergence in discrete time. Finally, we show that the family of Lagrangians is closed under time dilation (an orbit under the action of speeding up time), which demonstrates the universality of this Lagrangian view of acceleration in optimization.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…