Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions

Yasong Feng

Abstract

We prove convergence rates of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as
\begin{align*}
x_{t+1} = x_t - \eta_t \widehat{\nabla} f(x_t), \qquad t = 0, 1, 2, 3, \cdots,
\end{align*}
where $f$ is the objective function that satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f(x_t)$ is the approximate gradient estimated using zeroth-order information only. Our results show that $\{ f(x_t) - f(x_\infty) \}_{t \in \mathbb{N}}$ can converge faster than $\{ \| x_t - x_\infty \| \}_{t \in \mathbb{N}}$, regardless of whether the objective $f$ is smooth or nonsmooth.
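The iteration in the abstract can be illustrated with a minimal sketch. The abstract does not say how the approximate gradient is built, so the code below assumes a common two-point random-direction estimator, which may differ from the estimator analysed in the paper. The names `zo_gradient`, `szgd`, and `quadratic`, the smoothing radius `delta`, the number of directions `k`, the `seed`, and the constant step size (standing in for $\eta_t$) are all hypothetical choices made for illustration.

```python
import math
import random


def zo_gradient(f, x, delta, k, rng):
    """Estimate grad f(x) from function values only (assumed two-point estimator)."""
    n = len(x)
    grad = [0.0] * n
    for _ in range(k):
        # Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
        # Central finite difference of f along v: two function evaluations, no gradients.
        x_plus = [xi + delta * vi for xi, vi in zip(x, v)]
        x_minus = [xi - delta * vi for xi, vi in zip(x, v)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
        # Scale by n so the average over directions approximates the full gradient.
        for i in range(n):
            grad[i] += (n / k) * slope * v[i]
    return grad


def szgd(f, x0, step_size, num_steps, delta=1e-4, k=1, seed=0):
    """Run x_{t+1} = x_t - eta * (estimated gradient) with a constant step size eta."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    x = list(x0)
    for _ in range(num_steps):
        g = zo_gradient(f, x, delta, k, rng)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy objective f(x) = ||x||^2, minimized at the origin."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    x_final = szgd(quadratic, [1.0, -2.0, 0.5], step_size=0.05, num_steps=500)
    print("f(x_final) =", quadratic(x_final))
```

The toy objective $f(x) = \|x\|^2$ satisfies the Łojasiewicz inequality with exponent $\theta = 1/2$, since $\|\nabla f(x)\| = 2 f(x)^{1/2}$. Recording both $f(x_t)$ and $\|x_t\|$ along the run is one simple way to compare the two sequences the abstract refers to, though a single toy run says nothing about the paper's rates.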
