The Sample Complexity of Multiclass and Sparse Contextual Bandits

Alexander Rakhlin

Abstract

We study contextual bandits in the stochastic i.i.d. setting, where a learner observes contexts drawn from an unknown distribution, selects actions from a finite set A, and aims to identify an approximately optimal policy from a given class based on bandit feedback. Motivated by bandit multiclass classification with zero-one rewards, we focus on the s-sparse setting in which, for every context, the reward vector has L1-norm at most s |A|. Our main result is the design of algorithms that, with high probability, output a policy that is ε-optimal relative to a policy class Π using O((s/ε² + |A|/ε) |Π|/δ) samples. We extend this bound to general Natarajan classes and complement it with a matching lower bound (up to logarithmic factors), thereby closing a substantial gap left by prior work (Erez et al., 2024, 2025), which incurred an additional Θ(|A|⁹) dependence. We obtain these results via two complementary approaches. First, we analyze contextual bandits through the lens of contextual decision making with structured observations, designing an exploration-by-optimization algorithm whose sample complexity is governed by the decision-estimation coefficient (DEC; Foster et al., 2021, 2022). We show that, with s-sparse rewards, the induced model class admits a sharp DEC bound that scales with s and directly yields the optimal rate. Since this approach is largely information-theoretic and involves solving complex min-max optimization problems, we also develop a second, more specialized algorithmic method based on a low-variance exploration technique. This second approach yields concrete, tractable algorithms and extends naturally to contextual combinatorial semi-bandits, giving improved sample complexity guarantees for bandit multiclass list classification.
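To make the learning protocol concrete, the following is a minimal sketch of bandit multiclass classification with zero-one rewards, where each context has exactly one rewarding action. It is not either of the algorithms described in the abstract: it uses plain uniform exploration with inverse-propensity weighting as a naive baseline, and the toy context distribution, the function names, and the two example policies are all assumptions made for illustration.

```python
import random


def sample_round(rng, num_actions):
    """Draw a context and its hidden correct label (hypothetical toy distribution)."""
    context = rng.randrange(num_actions)
    label = context  # in this toy problem the correct label equals the context
    return context, label


def estimate_policy_values(policies, num_actions, num_samples, seed=0):
    """Estimate each policy's expected zero-one reward from bandit feedback.

    Baseline only: actions are explored uniformly at random, and each observed
    reward is reweighted by the inverse of its selection probability 1/|A|.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(policies)
    for _ in range(num_samples):
        context, label = sample_round(rng, num_actions)
        action = rng.randrange(num_actions)  # uniform exploration
        # Bandit feedback: the learner sees the reward of the chosen action only.
        reward = 1.0 if action == label else 0.0
        for i, policy in enumerate(policies):
            if policy(context) == action:
                totals[i] += reward * num_actions  # importance weight |A|
    return [total / num_samples for total in totals]


def best_policy_index(values):
    """Return the index of the policy with the highest estimated value."""
    return max(range(len(values)), key=lambda i: values[i])
```

For example, calling `estimate_policy_values([lambda x: x, lambda x: 0], num_actions=4, num_samples=20000)` compares a policy that always predicts the correct label with a constant policy. The importance weight |A| is what inflates the variance of this baseline, which is one way to see why the dependence on |A| is the quantity of interest in the bounds above.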
