Improved Regret Bounds for Online Kernel Selection under Bandit Feedback

Abstract

In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\Vert f\Vert^2_{\mathcal{H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds that improve on it. For smooth loss functions, we propose an algorithm with an $O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum_{i=1}^{K}L_T(f^\ast_i))^{\frac{2}{3}})$ expected bound, where $L_T(f^\ast_i)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_i=\{f\in\mathcal{H}_i: \Vert f\Vert_{\mathcal{H}_i}\leq U\}$. This data-dependent bound retains the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{\frac{2}{3}}T)$ expected bound, asymptotically improving the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds matching or improving the previous $O(\sqrt{T\ln K}+\Vert f\Vert^2_{\mathcal{H}_i}\max\{\sqrt{T},\frac{T}{\sqrt{\mathcal{R}}}\})$ expected bound, where $\mathcal{R}$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
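
The claim that the data-dependent bound retains the previous worst-case rate can be checked with a short calculation. The sketch below assumes, as an illustration not stated in the abstract, that each per-round loss is bounded by a constant $c$, so that $L_T(f^\ast_i)\leq cT$ for every kernel; the constant $c$ is hypothetical.

```latex
% Assumption (not from the abstract): per-round losses are bounded by a constant c,
% hence L_T(f_i^*) <= cT for each of the K candidate kernels.
\begin{align*}
\sum_{i=1}^{K} L_T(f^\ast_i) &\leq cKT, \\
U^{\frac{2}{3}} K^{-\frac{1}{3}} \Big(\sum_{i=1}^{K} L_T(f^\ast_i)\Big)^{\frac{2}{3}}
  &\leq U^{\frac{2}{3}} K^{-\frac{1}{3}} (cKT)^{\frac{2}{3}}
   = c^{\frac{2}{3}}\, U^{\frac{2}{3}} K^{\frac{1}{3}} T^{\frac{2}{3}}.
\end{align*}
```

Under this assumption the worst case matches the $K^{\frac{1}{3}}T^{\frac{2}{3}}$ dependence of the previous bound, while the left-hand side becomes much smaller when $\sum_{i=1}^{K}L_T(f^\ast_i)$ grows sublinearly in $T$, that is, when most candidate kernels fit the data well.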
