Relating Checkpoint Update Probabilities to Momentum Parameters in Single-Loop Variance Reduction Methods
Abstract
We propose a single-loop variance-reduced acceleration framework, which relates checkpoint update probabilities to momentum parameters, for solving the composite general convex problem where the smooth part has the finite-sum structure. Under the proposed framework, the growth rate of the momentum parameter is further altered, creating a novel continuous trade-off between acceleration and variance reduction, controlled by the key parameter α∈ [0,1]. A series of novel complexity is obtained, and some complexity of distinct known methods are rediscovered under the unified framework. When the mini-batch size is restricted due to the massive scale of the problem or the computational resource shortage, near-optimal complexity can still be achieved by choosing suitable α for any prefixed target accuracy. Analysis shows that although the considered gradient oracle is exact, acceleration comes with implicit price of heavier variance reduction, hence the obtained optimal α not necessarily corresponds to the largest allowable acceleration strength. Without prefixing the target accuracy, the proposed method achieves the near-optimal complexity O(n+n/ε) to obtain an ε-accurate solution under standard assumptions (n is the number of components of the finite-sum), significantly improves upon previous best complexity O(n+n/ε) of single-loop variance reduction methods, which does not exceed the complexity of the deterministic method FISTA. Numerical experiments demonstrate the efficiency of the proposed method and validate other theoretical findings.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.