Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions
Abstract
We prove convergence rates of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as
\begin{align*}
x_{t+1} = x_t - \eta_t \widehat{\nabla} f(x_t), \qquad t = 0, 1, 2, 3, \cdots,
\end{align*}
where $f$ is the objective function that satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f(x_t)$ is the approximate gradient estimated using zeroth-order information only. Our results show that $\{ f(x_t) - f(x_\infty) \}_{t \in \mathbb{N}}$ can converge faster than $\{ \| x_t - x_\infty \| \}_{t \in \mathbb{N}}$, regardless of whether the objective $f$ is smooth or nonsmooth.
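The iteration above can be illustrated with a minimal Python sketch. The abstract does not say how the approximate gradient is built, so the two-point finite-difference estimator along a random unit direction used here is an assumption, as are the toy objective $f(x) = \|x\|^2$, the constant step size, the smoothing radius `delta`, and every function name. It is meant only to show the shape of the update, not the paper's exact algorithm.

```python
import math
import random


def objective(x):
    """Toy objective f(x) = ||x||^2 (an illustrative assumption, not from the paper)."""
    return sum(xi * xi for xi in x)


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere by normalizing Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def zeroth_order_gradient(f, x, delta, rng):
    """Assumed estimator: dim * (f(x + delta*v) - f(x - delta*v)) / (2*delta) * v.

    Only function values of f are queried, never its gradient.
    """
    dim = len(x)
    v = random_unit_vector(dim, rng)
    x_plus = [xi + delta * vi for xi, vi in zip(x, v)]
    x_minus = [xi - delta * vi for xi, vi in zip(x, v)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [dim * slope * vi for vi in v]


def szgd(f, x0, step_size, delta, num_steps, seed=0):
    """Run x_{t+1} = x_t - eta * g_t with a constant step size eta (an assumption)."""
    rng = random.Random(seed)
    x = list(x0)
    history = [f(x)]  # objective value at every iterate
    for _ in range(num_steps):
        g = zeroth_order_gradient(f, x, delta, rng)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
        history.append(f(x))
    return x, history


x_final, values = szgd(objective, [1.0] * 5, step_size=0.1, delta=1e-3, num_steps=200)
```

Here `values` records $f(x_t)$ at each iterate, so it can be compared against the distance of `x_final` from the minimizer at the origin. For this quadratic the objective gap equals the squared distance, which gives a simple instance of function values shrinking faster than iterate distances.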