Optimal Local Convergence Rates of Stochastic First-Order Methods under Local α-PL
Abstract
We study the local convergence rate of stochastic first-order methods under a local α-Polyak-Łojasiewicz (α-PL) condition in a neighborhood of a target connected component M of the local minimizer set. The parameter α ∈ [1,2] is the exponent of the gradient norm in the α-PL inequality: α = 2 recovers the classical PL case, α = 1 corresponds to Hölder-type error bounds, and intermediate values interpolate between these regimes. Our performance criterion is the number of oracle queries required to output x with F(x) - l ≤ ε, where l := F(y) for any y ∈ M. We work in a local regime where the algorithm is initialized near M and, with high probability, its iterates remain in that neighborhood. We establish a lower bound Ω(ε^{-2/α}) for all stochastic first-order methods in this regime, and we obtain a matching upper bound O(ε^{-2/α}) for 1 ≤ α < 2 via a SARAH-type variance-reduced method with time-varying batch sizes and step sizes. In the convex setting, assuming a local α-PL condition on the ε-sublevel set, we further show a complexity lower bound Ω(ε^{-2/α}) for reaching an ε-global optimum, matching the ε-dependence of known accelerated stochastic subgradient methods.
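For orientation, a local α-PL condition of the kind described above can be sketched as follows. This is one common formulation, not necessarily the paper's exact definition: the constant c > 0 and the neighborhood U of M are illustrative names that do not appear in the abstract.

```latex
% Sketch of a local alpha-PL inequality (assumed form; c and U are hypothetical names).
% F is the objective, M the target component of local minimizers, l = F(y) for y in M.
\[
  \|\nabla F(x)\|^{\alpha} \;\ge\; c\,\bigl(F(x) - l\bigr)
  \qquad \text{for all } x \in U, \quad \alpha \in [1,2].
\]
% With alpha = 2 and c = 2\mu this reads \|\nabla F(x)\|^2 \ge 2\mu (F(x) - l),
% the classical PL inequality.
```

Under this form, α = 1 bounds the optimality gap linearly by the gradient norm, while α = 2 bounds it by the squared gradient norm.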