Efficient Continual Finite-Sum Minimization
Abstract
Given a sequence of functions $f_1, \ldots, f_n$ with $f_i : D \to \mathbb{R}$, finite-sum minimization seeks a point $x \in D$ minimizing $\sum_{j=1}^{n} f_j(x)/n$. In this work, we propose a key twist on finite-sum minimization, dubbed continual finite-sum minimization, that asks for a sequence of points $x_1, \ldots, x_n \in D$ such that each $x_i \in D$ minimizes the prefix-sum $\sum_{j=1}^{i} f_j(x)/i$. Assuming that each prefix-sum is strongly convex, we develop a first-order continual stochastic variance reduction gradient method (CSVRG) producing an $\epsilon$-optimal sequence with $O(n/\epsilon^{1/3} + 1/\epsilon)$ overall first-order oracle calls (FOs). An FO corresponds to the computation of a single gradient $\nabla f_j(x)$ at a given $x \in D$ for some $j \in [n]$. Our approach significantly improves upon the $O(n/\epsilon)$ FOs that StochasticGradientDescent requires and the $O(n^2 \log(1/\epsilon))$ FOs that state-of-the-art variance reduction methods such as Katyusha require. We also prove that there is no natural first-order method with $O(n/\epsilon^{\alpha})$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.
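To make the problem statement and the FO count concrete, here is a minimal Python sketch of a naive baseline; it is not CSVRG and is not taken from the paper. It assumes a hypothetical toy instance with $f_j(x) = \tfrac{1}{2}(x - a_j)^2$ on $D = \mathbb{R}$, so each prefix-sum is 1-strongly convex and its minimizer is the running mean of $a_1, \ldots, a_i$. It also assumes that "$\epsilon$-optimal" means each $x_i$ has prefix-sum suboptimality at most $\epsilon$. The names `make_problem`, `GradientOracle`, and `naive_continual` are illustrative only.

```python
import random


def make_problem(n, seed=0):
    """Hypothetical toy instance: f_j(x) = 0.5 * (x - a_j)**2 on D = R."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


class GradientOracle:
    """Counts first-order oracle (FO) calls: one call = one gradient of one f_j."""

    def __init__(self, a):
        self.a = a
        self.calls = 0

    def grad(self, j, x):
        self.calls += 1
        return x - self.a[j]  # gradient of 0.5 * (x - a_j)**2


def naive_continual(a, eps, step=0.5):
    """Baseline (not CSVRG): warm-started full gradient descent on every prefix."""
    oracle = GradientOracle(a)
    xs = []
    x = 0.0  # warm start: each prefix begins from the previous prefix's point
    for i in range(1, len(a) + 1):
        while True:
            # Full gradient of the prefix-sum (1/i) * sum_{j<=i} f_j: costs i FOs.
            g = sum(oracle.grad(j, x) for j in range(i)) / i
            # For this quadratic, suboptimality of the prefix-sum equals 0.5 * g**2.
            if 0.5 * g * g <= eps:
                break
            x -= step * g
        xs.append(x)
    return xs, oracle.calls


if __name__ == "__main__":
    a = make_problem(200)
    xs, calls = naive_continual(a, eps=1e-6)
    print(len(xs), calls)
```

Because this baseline evaluates at least one full prefix gradient per stage, it spends at least $n(n+1)/2$ FOs in total, which illustrates why re-solving every prefix from scratch scales quadratically in $n$.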