Analysis of Schedule-Free Nonconvex Optimization

Abstract

First-order methods underpin most large-scale learning algorithms, yet their classical convergence guarantees hinge on carefully scheduled step-sizes that depend on the total horizon T, which is rarely known in advance. The Schedule-Free (SF) method promises optimal performance with hyperparameters that are independent of T by interpolating between Polyak--Ruppert averaging and momentum, but nonconvex analysis of SF has been limited or reliant on strong global assumptions. We introduce a robust Lyapunov framework that, under only L-smoothness and lower-boundedness, reduces SF analysis to a single-step descent inequality. This yields horizon-agnostic bounds in the nonconvex setting: O(1/ T) for constant step + PR averaging, O( T/T) for a linearly growing step-size, and a continuum of O(T-(1-α)) rates for polynomial averaging. We complement these proofs with Performance Estimation Problem (PEP) experiments that numerically validate our rates and suggest that our O(1/ T) bound on the original nonconvex SF algorithm may tighten to O(1/T). Our work extends SF's horizon-free guarantees to smooth nonconvex optimization and charts future directions for optimal nonconvex rates.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…