Closing the Gap on the Sample Complexity of 1-Identification
Abstract
The 1-identification problem is a fundamental pure-exploration problem in multi-armed bandits. An agent aims to determine whether there exists an arm whose mean reward exceeds a known threshold μ0, or to output None otherwise. The agent must guarantee correctness with probability at least 1-δ, while minimizing the expected number of arm pulls E[τ]. We make two main contributions. First, for instances with at least one qualified arm, we derive a new lower bound on E[τ] via a novel optimization formulation. Second, we propose a new algorithm and establish upper bounds that match the lower bounds up to polylogarithmic factors uniformly over all instances. Our result complements the analysis of E[τ] when there are multiple qualified arms, which is an open problem in the literature.
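To make the problem setup concrete, the following is a minimal sketch of a naive round-robin baseline for 1-identification. It is not the algorithm proposed in the paper. It assumes Bernoulli rewards in [0, 1] and uses a Hoeffding confidence radius with a union bound over arms and pull counts. The names `confidence_radius`, `naive_one_identification`, and `max_pulls` are hypothetical and chosen only for illustration.

```python
import math
import random


def confidence_radius(n, num_arms, delta):
    """Hoeffding radius for rewards in [0, 1].

    Chosen so that a union bound over all arms and all pull counts n >= 1
    keeps the total failure probability below delta.
    """
    return math.sqrt(math.log(4 * num_arms * n * n / delta) / (2 * n))


def naive_one_identification(means, mu0, delta, rng, max_pulls=1_000_000):
    """Round-robin baseline (illustrative only, not the paper's algorithm).

    Returns (answer, total_pulls). The answer is the index of an arm whose
    lower confidence bound exceeds mu0, or None once every arm's upper
    confidence bound has dropped below mu0.
    """
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    total = 0
    active = list(range(k))  # arms not yet ruled out as below mu0

    while active and total < max_pulls:
        for arm in list(active):  # iterate over a copy; active may shrink
            # Assumption: Bernoulli reward with the arm's (hidden) mean.
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            total += 1

            mean = sums[arm] / counts[arm]
            radius = confidence_radius(counts[arm], k, delta)
            if mean - radius > mu0:
                return arm, total  # confidently above the threshold
            if mean + radius < mu0:
                active.remove(arm)  # confidently below the threshold

    if active:
        # The sampling budget ran out before a confident decision was made.
        raise RuntimeError("pull budget exhausted without a decision")
    return None, total
```

Calling `naive_one_identification([0.1, 0.9], mu0=0.5, delta=0.05, rng=random.Random(0))` returns a pair of the chosen arm (or None) and the number of pulls used, which plays the role of τ. Because this baseline samples every surviving arm equally, its pull count is generally far from the instance-dependent bounds on E[τ] that the paper studies.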