Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Abstract

We consider a linear stochastic bandit problem where the dimension K of the unknown parameter θ is larger than the sampling budget n. In such cases, it is in general impossible to derive sub-linear regret bounds since usual linear bandit algorithms have a regret in O(K√n). In this paper we assume that θ is S-sparse, i.e. has at most S non-zero components, and that the space of arms is the unit ball for the ||.||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in O(S√n).
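
To make the setting concrete, the following is a minimal sketch of the problem described in the abstract, not the authors' algorithm. It builds an S-sparse parameter θ, draws an arm from the unit ball, observes a noisy linear reward, and compares the two regret rates quoted above while ignoring constants and logarithmic factors. The function names, the Gaussian noise model, the noise level, the choice of exactly S non-zero entries (the abstract allows at most S), and the numerical values of K, S and n are all assumptions made for illustration.

```python
import math
import random


def make_sparse_theta(K, S, rng):
    """Return a K-dimensional vector with exactly S non-zero entries and unit l2 norm.

    Assumption: the abstract only requires at most S non-zero components.
    """
    theta = [0.0] * K
    for i in rng.sample(range(K), S):
        theta[i] = rng.choice([-1.0, 1.0])
    norm = math.sqrt(sum(v * v for v in theta))
    return [v / norm for v in theta]


def random_unit_arm(K, rng):
    """Draw an arm uniformly from the unit sphere, which lies inside the unit ball."""
    g = [rng.gauss(0.0, 1.0) for _ in range(K)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def pull(arm, theta, rng, noise_std=0.1):
    """Observe the linear reward <arm, theta> plus Gaussian noise (assumed noise model)."""
    mean_reward = sum(a * t for a, t in zip(arm, theta))
    return mean_reward + rng.gauss(0.0, noise_std)


def regret_rates(K, S, n):
    """Return (K * sqrt(n), S * sqrt(n)), the two rates without constants or log factors."""
    return K * math.sqrt(n), S * math.sqrt(n)


if __name__ == "__main__":
    rng = random.Random(0)
    K, S, n = 1000, 5, 400  # hypothetical values with K > n
    theta = make_sparse_theta(K, S, rng)
    reward = pull(random_unit_arm(K, rng), theta, rng)
    dense_rate, sparse_rate = regret_rates(K, S, n)
    print(dense_rate)   # 20000.0, far larger than n = 400
    print(sparse_rate)  # 100.0, smaller than n = 400
```

With these hypothetical values, K√n exceeds the budget n, so a bound of that order says nothing beyond the trivial linear regret, whereas S√n stays below n.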
