Online learning with kernel losses

Peter L. Bartlett

Online learning with kernel losses

Abstract

We present a generalization of the adversarial linear bandits framework, where the underlying losses are kernel functions (with an associated reproducing kernel Hilbert space) rather than linear functions. We study a version of the exponential weights algorithm and bound its regret in this setting. Under conditions on the eigendecay of the kernel we provide a sharp characterization of the regret for this algorithm. When we have polynomial eigendecay μj O(j-β), we find that the regret is bounded by Rn O(nβ/(2(β-1))); while under the assumption of exponential eigendecay μj O(e-β j ), we get an even tighter bound on the regret Rn O(n1/2(n)1/2). We also study the full information setting when the underlying losses are kernel functions and present an adapted exponential weights algorithm and a conditional gradient descent algorithm.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…