Testing k-Modal Distributions: Optimal Algorithms via Reductions

Abstract

We give highly efficient algorithms, and almost matching lower bounds, for a range of basic statistical problems that involve testing and estimating the L1 distance between two k-modal distributions p and q over the discrete domain {1,…,n}. More precisely, given sample access to an unknown k-modal distribution p, we consider the following four problems. Testing identity to a known or unknown distribution: (1) determine whether p = q (for an explicitly given k-modal distribution q) versus p is ε-far from q; (2) determine whether p = q (where q is available via sample access) versus p is ε-far from q. Estimating L1 distance ("tolerant testing") against a known or unknown distribution: (3) approximate d_TV(p,q) to within additive ε, where q is an explicitly given k-modal distribution; (4) approximate d_TV(p,q) to within additive ε, where q is available via sample access. For each of these four problems we give sub-logarithmic sample algorithms, which we show are tight up to additive poly(k) and multiplicative polylog log n + polylog k factors. Our bounds thus significantly improve on the previous results of [BKR:04], which addressed testing identity of distributions (items (1) and (2) above) only in the special cases k = 0 (monotone distributions) and k = 1 (unimodal distributions) and required O((log n)^3) samples. As our main conceptual contribution, we introduce a new reduction-based approach for distribution-testing problems that lets us obtain all of the above results in a unified way. Roughly speaking, this approach transforms various distribution-testing problems for k-modal distributions over {1,…,n} into the corresponding distribution-testing problems for unrestricted distributions over a much smaller domain {1,…,ℓ}, where ℓ = O(k log n).
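To make the two central objects concrete, the following minimal Python sketch computes the total variation distance d_TV(p,q) (half the L1 distance) for explicitly given distributions, and collapses a distribution over {1,…,n} onto a coarser domain of intervals. The function names `tv_distance` and `collapse`, the example distributions, and the hand-picked interval partitions are all assumptions made for illustration; the abstract does not specify how the reduction chooses its intervals, so this is not the paper's construction. The sketch only illustrates the general fact that merging domain elements can never increase d_TV, and that a poorly chosen partition can hide part of the distance.

```python
from fractions import Fraction


def tv_distance(p, q):
    """Total variation distance between two explicit distributions
    on the same domain: half of their L1 distance."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


def collapse(p, starts):
    """Collapse p onto a coarser domain of consecutive intervals.

    `starts` holds the 0-based start index of each interval; interval i
    covers indices starts[i] .. starts[i+1]-1 (the last one runs to the end).
    The result has one probability mass per interval.
    """
    edges = list(starts) + [len(p)]
    return [sum(p[lo:hi]) for lo, hi in zip(edges, edges[1:])]


# Hypothetical example over a domain of size n = 8.
# p is uniform; q is monotone non-increasing (the k = 0 case).
p = [Fraction(1, 8)] * 8
q = [Fraction(1, 4)] * 2 + [Fraction(1, 8)] * 2 + [Fraction(1, 16)] * 4

print(tv_distance(p, q))  # 1/4

# A partition on which q is flat inside every interval keeps the distance.
good = [0, 2, 4]
print(tv_distance(collapse(p, good), collapse(q, good)))  # 1/4

# A coarser, badly aligned partition loses part of the distance.
bad = [0, 1]
print(tv_distance(collapse(p, bad), collapse(q, bad)))  # 1/8
```

Exact rational arithmetic (`fractions.Fraction`) is used here only so the printed distances are exact; with sample access instead of explicit distributions, these quantities would have to be estimated rather than computed.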
