Efficient Density Estimation via Piecewise Polynomial Approximation

Abstract

We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let p be an arbitrary distribution over an interval I which is τ-close (in total variation distance) to an unknown probability distribution q that is defined by an unknown partition of I into t intervals and t unknown degree-d polynomials specifying q over each of the intervals. We give an algorithm that draws O(t(d+1)/2) samples from p, runs in time (t,d,1/), and with high probability outputs a piecewise polynomial hypothesis distribution h that is (O(τ)+)-close (in total variation distance) to p. This sample complexity is essentially optimal; we show that even for τ=0, any algorithm that learns an unknown t-piecewise degree-d probability distribution over I to accuracy must use ( t(d+1) (1 + (d+1)) · 1 2) samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming. We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of t-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of k-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…