Sharp analysis of linear ensemble sampling

Csaba Szepesvári

Abstract

We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m = \Theta(d \log n)$, ES attains $\tilde{O}(d^{3/2}\sqrt{n})$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for $m$ independent Brownian motions. This continuous-time lens appears particularly natural here: it yields an exact representation of the relevant discrete-time processes, and we do not know another route to a sharp ES bound.
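For readers unfamiliar with the algorithm, the following is a minimal sketch of one common form of linear ensemble sampling, not necessarily the exact variant analysed in the paper. It assumes a fixed finite arm set, Gaussian reward noise, a ridge parameter `lam`, and a Gaussian prior perturbation for each ensemble member; the function name and all parameter names are hypothetical.

```python
import numpy as np


def linear_ensemble_sampling(arms, theta_star, n, m, lam=1.0, noise_sd=1.0, seed=0):
    """Illustrative linear ensemble sampling on a finite arm set.

    arms: (K, d) array of arm feature vectors.
    theta_star: (d,) true parameter, used only to simulate rewards.
    Returns an (n,) array with the cumulative pseudo-regret after each round.
    """
    rng = np.random.default_rng(seed)
    _, d = arms.shape
    V = lam * np.eye(d)  # regularised design matrix, shared by all members
    # Each member j keeps its own perturbed response vector b[j].
    # Assumption: the initial perturbation is Gaussian with covariance lam * I.
    b = np.sqrt(lam) * rng.standard_normal((m, d))
    best = np.max(arms @ theta_star)
    regret = np.zeros(n)
    total = 0.0
    for t in range(n):
        j = rng.integers(m)                   # choose a member uniformly at random
        theta_j = np.linalg.solve(V, b[j])    # that member's perturbed ridge estimate
        x = arms[np.argmax(arms @ theta_j)]   # act greedily for the chosen member
        y = x @ theta_star + noise_sd * rng.standard_normal()
        V += np.outer(x, x)
        # Every member sees the reward plus its own standard Gaussian perturbation.
        z = rng.standard_normal(m)
        b += (y + z)[:, None] * x[None, :]
        total += best - x @ theta_star
        regret[t] = total
    return regret
```

Each round costs one linear solve for the selected member plus an update of all `m` response vectors, so the ensemble size `m` directly controls the per-round cost of the sketch.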
