Testing probability distributions using conditional samples

Abstract

We study a new framework for property testing of probability distributions, by considering distribution testing algorithms that have access to a conditional sampling oracle.* This is an oracle that takes as input a subset S ⊆ [N] of the domain [N] of the unknown probability distribution D and returns a draw from the conditional probability distribution D restricted to S. This new model allows considerable flexibility in the design of distribution testing algorithms; in particular, testing algorithms in this model can be adaptive. We study a wide range of natural distribution testing problems in this new framework and some of its variants, giving both upper and lower bounds on query complexity. These problems include testing whether D is the uniform distribution U; testing whether D = D* for an explicitly provided D*; testing whether two unknown distributions D1 and D2 are equivalent; and estimating the variation distance between D and the uniform distribution. At a high level our main finding is that the new "conditional sampling" framework we consider is a powerful one: while all the problems mentioned above have Ω(√N) sample complexity in the standard model (and in some cases the complexity must be almost linear in N), we give poly(log N, 1/ε)-query algorithms (and in some cases poly(1/ε)-query algorithms independent of N) for all these problems in our conditional sampling setting.

* Independently from our work, Chakraborty et al. also considered this framework. We discuss their work in Subsection 1.4.
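The conditional sampling oracle described in the abstract can be illustrated with a minimal Python sketch. This is not code from the paper: the name `cond_sample`, the dictionary representation of D, and the choice to raise an error when D(S) = 0 are all assumptions made for illustration (the abstract does not say how the oracle behaves on a zero-probability set). A real tester would only be able to query such an oracle, not read D directly.

```python
import random
from typing import Dict, Iterable, Optional


def cond_sample(dist: Dict[int, float],
                subset: Iterable[int],
                rng: Optional[random.Random] = None) -> int:
    """Draw one element from D restricted to S (hypothetical helper).

    dist   -- maps each domain element to its probability under D
    subset -- the query set S, a subset of the domain
    rng    -- optional random.Random instance, for reproducibility
    """
    rng = rng or random.Random()
    # Keep only the elements of S that have positive mass under D.
    support = [x for x in sorted(set(subset)) if dist.get(x, 0.0) > 0.0]
    if not support:
        # Assumption: the abstract does not define this case, so we refuse.
        raise ValueError("D(S) = 0: conditional distribution is undefined")
    # random.choices normalizes the weights, giving D(x) / D(S) for x in S.
    weights = [dist[x] for x in support]
    return rng.choices(support, weights=weights, k=1)[0]
```

For example, with D = {1: 0.5, 2: 0.25, 3: 0.25} and S = {1, 2}, the sketch returns 1 with probability 2/3 and 2 with probability 1/3. Because the algorithm picks each S itself, later queries can depend on earlier answers, which is the adaptivity the abstract refers to.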
