The Nesterov-Spokoiny Acceleration Achieves Strict o(1/k^2) Convergence
Abstract
This paper studies the Nesterov-Spokoiny Acceleration (NSA), a variant of the accelerated gradient method of Nesterov and Spokoiny. For smooth convex optimization, NSA achieves a strict o(1/k^2) convergence rate in function value and an o(1/(k3 k)) rate in squared gradient norm, while ensuring monotonic descent of the objective. We further study a zeroth-order version of NSA that handles inexact gradients and extend NSA to composite optimization problems, in each case establishing o(1/k^2) convergence in function value. A continuous-time analysis reveals connections to high-resolution ODEs known to underlie acceleration phenomena.