Properly Learning Poisson Binomial Distributions in Almost Polynomial Time

Abstract

We give an algorithm for properly learning Poisson binomial distributions. A Poisson binomial distribution (PBD) of order n is the discrete probability distribution of the sum of n mutually independent Bernoulli random variables. Given Õ(1/ε^2) samples from an unknown PBD p, our algorithm runs in time (1/ε)^{O(log log(1/ε))}, and outputs a hypothesis PBD that is ε-close to p in total variation distance. The previously best known running time for properly learning PBDs was (1/ε)^{O(log(1/ε))}. As one of our main contributions, we provide a novel structural characterization of PBDs. We prove that, for all ε > 0, there exists an explicit collection M of (1/ε)^{O(log log(1/ε))} vectors of multiplicities, such that for any PBD p there exists a PBD q with O(log(1/ε)) distinct parameters whose multiplicities are given by some element of M, such that q is ε-close to p. Our proof combines tools from Fourier analysis and algebraic geometry. Our approach to the proper learning problem is as follows: starting with an accurate non-proper hypothesis, we fit a PBD to this hypothesis. More specifically, we essentially start with the hypothesis computed by the computationally efficient non-proper learning algorithm in our recent work [DKS15]. Our aforementioned structural characterization allows us to reduce the corresponding fitting problem to a collection of (1/ε)^{O(log log(1/ε))} systems of low-degree polynomial inequalities. We show that each such system can be solved in time (1/ε)^{O(log log(1/ε))}, which yields the overall running time of our algorithm.
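To make the objects in the abstract concrete, the following is a minimal Python sketch of a PBD's probability mass function, of a PBD described by a few distinct parameters with multiplicities, and of total variation distance. It is not the paper's learning algorithm. The function names (pbd_pmf, expand_multiplicities, total_variation) and the example parameter values are assumptions chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def pbd_pmf(probs):
    """PMF of the sum of independent Bernoulli(p_i) variables.

    Returns an array of length n + 1 whose k-th entry is Pr[sum = k].
    Computed by convolving the two-point PMFs one at a time.
    """
    pmf = np.array([1.0])  # the empty sum is 0 with probability 1
    for p in probs:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf


def expand_multiplicities(params, mults):
    """Expand distinct parameters and their multiplicities into a full
    list of Bernoulli parameters (hypothetical helper for illustration)."""
    return np.repeat(params, mults)


def total_variation(pmf_a, pmf_b):
    """Total variation distance: half the L1 distance between the PMFs."""
    size = max(len(pmf_a), len(pmf_b))
    a = np.pad(pmf_a, (0, size - len(pmf_a)))
    b = np.pad(pmf_b, (0, size - len(pmf_b)))
    return 0.5 * np.abs(a - b).sum()


# Illustrative values only: a PBD p with four distinct parameters, and a
# PBD q with two distinct parameters, each appearing with multiplicity 2.
p = pbd_pmf([0.10, 0.12, 0.50, 0.52])
q = pbd_pmf(expand_multiplicities([0.11, 0.51], [2, 2]))
print(total_variation(p, q))
```

In this sketch, q plays the role of a PBD with few distinct parameters that approximates p; the structural result in the abstract concerns how few distinct parameters, and which multiplicity vectors, suffice for ε-closeness.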
