Optimistic Rates for Learning with a Smooth Loss
Abstract
We establish an excess risk bound of $\tilde{O}\left(H R_n^2 + \sqrt{H L^*}\, R_n\right)$ for empirical risk minimization with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class. For typical hypothesis classes where $R_n = \sqrt{R/n}$, this translates to a learning rate of $\tilde{O}(RH/n)$ in the separable ($L^* = 0$) case and $\tilde{O}\left(RH/n + \sqrt{L^* RH/n}\right)$ more generally. We also provide similar guarantees for online and stochastic convex optimization with a smooth non-negative objective.
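As a quick check of how the general bound specializes, the substitution $R_n = \sqrt{R/n}$ can be carried out term by term. This is only a sketch of the arithmetic, with constants and logarithmic factors suppressed; it is not taken from the paper's proofs.

```latex
\begin{align*}
  H R_n^2 &= H \cdot \frac{R}{n} = \frac{RH}{n}, \\
  \sqrt{H L^*}\, R_n &= \sqrt{H L^*} \cdot \sqrt{\frac{R}{n}} = \sqrt{\frac{L^* R H}{n}}, \\
  H R_n^2 + \sqrt{H L^*}\, R_n &= \frac{RH}{n} + \sqrt{\frac{L^* R H}{n}}.
\end{align*}
```

When $L^* = 0$ the square-root term vanishes and only the fast $RH/n$ rate remains; when $L^*$ is bounded away from zero, the square-root term dominates for large $n$ and the rate is of order $1/\sqrt{n}$.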