Online Learning: A Modern Introduction Using Convex Optimization
Abstract
In this book, I introduce the concepts of online learning through a modern view based on convex optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I attempted to unify all the literature as instantiations of Online Mirror Descent and Follow-the-Regularized-Leader (and their variants). I paid particular attention to the issue of tuning the parameters of the algorithms, through adaptive and parameter-free online learning algorithms. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. Building on fundamental algorithms and concepts, I also cover advanced topics, including black-box reductions, saddle-point optimization, sequential investment, and non-stationary forms of regret analysis. Finally, I conclude with a selection of applications of online learning to domains far from it, such as generalization theory and concentration inequalities. I attempted to maintain an informal, yet mathematically rigorous, tone throughout the book. Moreover, all the included proofs have been carefully chosen to be as simple and as short as possible. This also means that sometimes I have added one or two additional assumptions, just to simplify the proofs.