The Sample Complexity of Simple Binary Hypothesis Testing

Po-Ling Loh

Abstract

The sample complexity of simple binary hypothesis testing is the smallest number of i.i.d. samples required to distinguish between two distributions p and q in either: (i) the prior-free setting, with type-I error at most α and type-II error at most β; or (ii) the Bayesian setting, with Bayes error at most δ and prior distribution (π, 1-π). This problem has only been studied when α = β (prior-free) or π = 1/2 (Bayesian), and the sample complexity is known to be characterized by the Hellinger divergence between p and q, up to multiplicative constants. In this paper, we derive a formula that characterizes the sample complexity (up to multiplicative constants that are independent of p, q, and all error parameters) for: (i) all 0 ≤ α, β ≤ 1/8 in the prior-free setting; and (ii) all δ ≤ π/4 in the Bayesian setting. In particular, the formula admits equivalent expressions in terms of certain divergences from the Jensen--Shannon and Hellinger families. The main technical result concerns an f-divergence inequality between members of the Jensen--Shannon and Hellinger families, which is proved by a combination of information-theoretic tools and case-by-case analyses. We explore applications of our results to (i) robust hypothesis testing, (ii) distributed (locally-private and communication-constrained) hypothesis testing, (iii) sequential hypothesis testing, and (iv) hypothesis testing with erasures.
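To make the previously known symmetric case concrete, the following minimal sketch computes the squared Hellinger distance between two discrete distributions and the resulting 1/H² scale of the sample complexity for a fixed constant error level. It is an illustration under stated assumptions, not the formula derived in the paper: the function names `hellinger_sq` and `symmetric_sample_scale` are hypothetical, the normalization H²(p, q) = 1 - Σ √(p_i q_i) is one common convention, and all multiplicative constants are omitted.

```python
import math


def hellinger_sq(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Assumed convention: H^2(p, q) = 1 - sum_i sqrt(p_i * q_i),
    which lies in [0, 1]. Other texts differ by a factor of 2.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same support size")
    bhattacharyya = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp at zero to absorb floating-point rounding.
    return max(0.0, 1.0 - bhattacharyya)


def symmetric_sample_scale(p, q):
    """Order-of-magnitude scale 1 / H^2(p, q) for the symmetric case.

    Assumption: a fixed constant error level with alpha = beta (or
    pi = 1/2). Constants are omitted, so this is a scale, not an
    exact sample count.
    """
    h2 = hellinger_sq(p, q)
    return math.inf if h2 == 0.0 else 1.0 / h2


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.6, 0.4]
    print(hellinger_sq(p, q), symmetric_sample_scale(p, q))
```

As the two distributions get closer, H² shrinks and the scale grows, which matches the intuition that similar distributions need more samples to tell apart. The asymmetric regimes covered by the paper (unequal α and β, or a skewed prior π) are not captured by this sketch.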
