A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents

Daniel Russo

A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents

Abstract

This note gives a short, self-contained, proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multi-armed bandit problem with discount factor γ. The Gittins index of an arm is shown to equal the γ-quantile of the posterior distribution of the arm's mean plus an error term that vanishes as γ 1. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean-reward of an arm in a manner equivalent to an upper confidence bound.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…