Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization
Abstract
We consider the optimization problem of the form $\min_{x \in \mathbb{R}^d} f(x) \triangleq \mathbb{E}_{\xi}[F(x; \xi)]$, where the component $F(x; \xi)$ is $L$-mean-squared Lipschitz but possibly nonconvex and nonsmooth. The recently proposed gradient-free method requires at most $\mathcal{O}(L^4 d^{3/2} \epsilon^{-4} + \Delta L^3 d^{3/2} \delta^{-1} \epsilon^{-4})$ stochastic zeroth-order oracle complexity to find a $(\delta, \epsilon)$-Goldstein stationary point of the objective function, where $\Delta = f(x_0) - \inf_{x \in \mathbb{R}^d} f(x)$ and $x_0$ is the initial point of the algorithm. This paper proposes a more efficient algorithm using stochastic recursive gradient estimators, which improves the complexity to $\mathcal{O}(L^3 d^{3/2} \epsilon^{-3} + \Delta L^2 d^{3/2} \delta^{-1} \epsilon^{-3})$.
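To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard two-point zeroth-order estimator with a direction drawn uniformly from the unit sphere, and a generic SARAH/SPIDER-style recursion that reuses the same random samples at two consecutive iterates. All function names (`sample_unit_sphere`, `two_point_estimate`, `recursive_update`) and the way samples are passed are hypothetical choices made for illustration.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def two_point_estimate(F, x, xi, w, delta):
    """Return d / (2*delta) * (F(x + delta*w, xi) - F(x - delta*w, xi)) * w.

    Only function values of F are used, never its gradient.
    """
    d = len(x)
    forward = F([a + delta * b for a, b in zip(x, w)], xi)
    backward = F([a - delta * b for a, b in zip(x, w)], xi)
    scale = d * (forward - backward) / (2.0 * delta)
    return [scale * b for b in w]


def recursive_update(F, v_prev, x_new, x_old, samples, delta):
    """Assumed SARAH/SPIDER-style recursion:
    v_new = v_prev + mean over samples of (g(x_new) - g(x_old)).

    Each (xi, w) pair is evaluated at both points, so the correction
    is small whenever x_new is close to x_old.
    """
    diff = [0.0] * len(x_new)
    for xi, w in samples:
        g_new = two_point_estimate(F, x_new, xi, w, delta)
        g_old = two_point_estimate(F, x_old, xi, w, delta)
        diff = [s + (a - b) / len(samples) for s, a, b in zip(diff, g_new, g_old)]
    return [v + s for v, s in zip(v_prev, diff)]
```

A caller would typically compute a large-batch average of `two_point_estimate` every so often and apply `recursive_update` with a small batch in between. The batch sizes, step sizes, and restart schedule that yield the stated complexity are not given in the abstract and are left open here.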