An Asymptotic Law of the Iterated Logarithm for KL∈f
Abstract
The population KL∈f is a fundamental quantity that appears in lower bounds for (asymptotically) optimal regret of pure-exploration stochastic bandit algorithms, and optimal stopping time of sequential tests. Motivated by this, an empirical KL∈f statistic is frequently used in the design of (asymptotically) optimal bandit algorithms and sequential tests. While nonasymptotic concentration bounds for the empirical KL∈f have been developed, their optimality in terms of constants and rates is questionable, and their generality is limited (usually to bounded observations). The fundamental limits of nonasymptotic concentration are often described by the asymptotic fluctuations of the statistics. With that motivation, this paper presents a tight (upper and lower) law of the iterated logarithm for empirical KL∈f applying to extremely general (unbounded) data.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.