Memory Efficient And Minimax Distribution Estimation Under Wasserstein Distance Using Bayesian Histograms

Abstract

We study Bayesian histograms for distribution estimation on $[0,1]^d$ under the Wasserstein distance $W_v$, $1 \leq v < \infty$, in the i.i.d. sampling regime. We newly show that when $d < 2v$, histograms possess a special memory efficiency property: relative to the sample size $n$, order $n^{d/2v}$ bins are needed to obtain minimax rate optimality. This result holds for the posterior mean histogram and with respect to posterior contraction, under the class of Borel probability measures and some classes of smooth densities. The attained memory footprint improves on existing minimax optimal procedures by a polynomial factor in $n$; for example, it gives an $n^{1 - d/2v}$ factor reduction in the footprint compared to the empirical measure, a minimax estimator in the Borel probability measure class. Additionally, constructing both the posterior mean histogram and the posterior itself can be done super-linearly in $n$. Due to the popularity of the $W_1$ and $W_2$ metrics and the coverage provided by the $d < 2v$ case, our results are of most practical interest in the $(d=1, v=1,2)$, $(d=2, v=2)$, and $(d=3, v=2)$ settings, and we provide simulations demonstrating the theory in several of these instances.
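
To make the bin-count scaling concrete, the following is a minimal sketch, not the paper's implementation. It assumes a regular grid on the unit cube and a symmetric Dirichlet prior on the bin weights; the prior choice, the parameter `alpha`, and the function names are illustrative assumptions that do not come from the abstract. The total number of bins is chosen to be of order $n^{d/2v}$, and the posterior mean weight of each bin is its count plus `alpha`, normalized.

```python
import math
import numpy as np

def num_bins_per_axis(n, d, v):
    """Bins per axis so that the total bin count is of order n^(d/(2v))."""
    total = n ** (d / (2 * v))
    return max(1, math.ceil(total ** (1 / d)))

def posterior_mean_histogram(samples, d, v, alpha=1.0):
    """Posterior mean bin weights on [0,1]^d.

    Assumes a symmetric Dirichlet(alpha) prior on the bin weights;
    this prior is an illustrative assumption, not taken from the abstract.
    """
    samples = np.asarray(samples, dtype=float).reshape(-1, d)
    n = samples.shape[0]
    m = num_bins_per_axis(n, d, v)
    # Map each coordinate to a bin index in {0, ..., m-1};
    # the clip keeps a coordinate equal to 1.0 in the last bin.
    idx = np.minimum((samples * m).astype(int), m - 1)
    flat = np.ravel_multi_index(tuple(idx.T), (m,) * d)
    counts = np.bincount(flat, minlength=m ** d)
    # Dirichlet-multinomial conjugacy: posterior mean is (count + alpha) / (n + alpha * K).
    weights = (counts + alpha) / (n + alpha * m ** d)
    return weights.reshape((m,) * d)

rng = np.random.default_rng(0)
weights = posterior_mean_histogram(rng.uniform(size=10_000), d=1, v=1)
print(weights.size, "bins stored for 10000 samples")
```

With $d = 1$ and $v = 1$, the sketch stores roughly $\sqrt{n}$ weights instead of the $n$ atoms of the empirical measure, which is the $n^{1 - d/2v}$ reduction described above.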
