Sharp analysis of linear ensemble sampling
Abstract
We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size m=Θ(d n), ES attains O(d3/2 n) high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for m independent Brownian motions. This continuous-time lens appears particularly natural here: it yields an exact representation of the relevant discrete-time processes, and we do not know another route to a sharp ES bound.
0