Optimistic Rates for Learning with a Smooth Loss

Abstract

We establish an excess risk bound of $\tilde{O}(H R_n^2 + R_n \sqrt{H L^*})$ for empirical risk minimization with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class. For typical hypothesis classes where $R_n = \sqrt{R/n}$, this translates to a learning rate of $\tilde{O}(RH/n)$ in the separable ($L^* = 0$) case and $\tilde{O}(RH/n + \sqrt{L^* RH/n})$ more generally. We also provide similar guarantees for online and stochastic convex optimization with a smooth non-negative objective.
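
As a quick check of how the stated learning rates follow from the general bound, the substitution can be written out as below. This is only a sketch of the arithmetic inside the bound, taking the Rademacher complexity to scale as $R_n = \sqrt{R/n}$ and suppressing constants and logarithmic factors; it is not a reproduction of the paper's proof.

```latex
\begin{align*}
H R_n^2 + R_n \sqrt{H L^*}
  &= H \cdot \frac{R}{n} + \sqrt{\frac{R}{n}} \cdot \sqrt{H L^*}
     && \text{substituting } R_n = \sqrt{R/n} \\
  &= \frac{RH}{n} + \sqrt{\frac{L^* R H}{n}} .
\end{align*}
% Separable case: L^* = 0 makes the square-root term vanish,
% leaving the fast rate RH/n.
% General case: when L^* > 0 and n is large, the square-root term dominates,
% giving the usual 1/sqrt(n) rate scaled by sqrt(L^*).
```

The bound therefore interpolates between a $1/n$ rate when the best achievable risk is zero and a $1/\sqrt{n}$ rate otherwise, with the slower term shrinking as $L^*$ gets smaller.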
