Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
Abstract
We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our techniques also recover all optimal or best-known results for finding $\epsilon$-stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.
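To get a feel for how the stated rates scale, the following sketch evaluates the three bounds for sample values of $\epsilon$ and $\delta$. It is an illustration only: the function names are hypothetical, and every constant and logarithmic factor hidden by the $O(\cdot)$ notation is assumed to be 1, so the outputs show relative scaling rather than actual query counts.

```python
def query_bound(eps: float, delta: float, eps_exp: float, delta_exp: float) -> float:
    """Return eps**(-eps_exp) * delta**(-delta_exp).

    Assumption: the hidden constant in the O(.) bound is taken to be 1.
    """
    return eps ** (-eps_exp) * delta ** (-delta_exp)


def previous_stochastic(eps: float, delta: float) -> float:
    # Previously best-known stochastic rate from the abstract: eps^-4 * delta^-1
    return query_bound(eps, delta, 4.0, 1.0)


def new_stochastic(eps: float, delta: float) -> float:
    # Improved (and optimal) stochastic rate from the abstract: eps^-3 * delta^-1
    return query_bound(eps, delta, 3.0, 1.0)


def new_deterministic_second_order(eps: float, delta: float) -> float:
    # Deterministic, second-order smooth rate from the abstract: eps^-1.5 * delta^-0.5
    return query_bound(eps, delta, 1.5, 0.5)


if __name__ == "__main__":
    eps, delta = 0.01, 0.1  # sample values chosen only for illustration
    old = previous_stochastic(eps, delta)
    new = new_stochastic(eps, delta)
    print(f"previous stochastic bound: {old:.3e}")
    print(f"new stochastic bound:      {new:.3e}")
    print(f"improvement factor:        {old / new:.1f}")
    print(f"deterministic 2nd-order:   {new_deterministic_second_order(eps, delta):.3e}")
```

Under these assumptions the improvement in the stochastic setting is a factor of $1/\epsilon$, independent of $\delta$, since the two bounds differ only in the exponent of $\epsilon$.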