The Wasserstein cost of Importance Sampling
Abstract
Importance sampling (IS) consists in biasing samples from a distribution $f$ towards another distribution $g$. Concretely, given samples $X_i$ from $f$, the IS measure is $g_n = \frac{1}{Z_n}\sum_{i=1}^n \frac{g(X_i)}{f(X_i)}\,\delta_{X_i}$, with $Z_n = \sum_{i=1}^n \frac{g(X_i)}{f(X_i)}$. The random measure $g_n$ approximates $g$, and is used in many contexts ranging from Monte Carlo integration to Bayesian inference. We show that, in high dimension ($d \geqslant 3$), the Wasserstein cost $W_p^p(g_n, g)$ has order $n^{-p/d}$ in expectation, i.e.
$$\beta^{\mathrm{low}}_{p,d}\int g f^{-p/d} \leqslant \liminf_{n\to\infty} n^{p/d}\,\mathbb{E}\left[W_p^p(g_n, g)\right] \leqslant \limsup_{n\to\infty} n^{p/d}\,\mathbb{E}\left[W_p^p(g_n, g)\right] \leqslant \beta_{p,d}\int g f^{-p/d},$$
where $0 < \beta^{\mathrm{low}}_{p,d} \leqslant \beta_{p,d}$ are constants depending only on $p$ and $d$, which are equal for $p = 2$ and conjectured to be equal for any $p \geqslant 1$. Our results are valid for all $p \geqslant 1$ and $d \geqslant 3$. In the case where $\beta^{\mathrm{low}}_{p,d} = \beta_{p,d}$, we show that the asymptotically optimal sampling distribution $f^*$ for importance sampling is not equal to $g$ but to a tempered version of $g$, namely $f^* \propto g^{d/(p+d)}$, which is reminiscent of Zador's theorem in the domain of measure quantization.
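To make the construction of the IS measure concrete, here is a minimal Python sketch. It is an illustration under assumptions that are not from the paper: the target $g$ is taken to be a standard Gaussian on $\mathbb{R}^d$ and the proposal $f$ a centered isotropic Gaussian, and the function names are hypothetical. For this particular $g$, the tempered density $g^{d/(p+d)}$ is again a centered Gaussian, with variance $(p+d)/d$, so the proposal suggested by the abstract is simply a wider Gaussian. The sketch only builds the atoms and normalized weights of $g_n$; it does not compute any Wasserstein distance.

```python
import numpy as np


def log_gaussian_density(x, variance):
    """Log-density of a centered isotropic Gaussian N(0, variance * I) at rows of x."""
    d = x.shape[1]
    return -0.5 * np.sum(x**2, axis=1) / variance - 0.5 * d * np.log(2 * np.pi * variance)


def tempered_variance(p, d):
    """Variance of g^{d/(p+d)} (normalized) when g is N(0, I). Assumes Gaussian g."""
    return (p + d) / d


def importance_sampling_measure(n, d, proposal_variance, rng):
    """Return atoms X_i ~ f and normalized weights g(X_i) / (f(X_i) * Z_n).

    Assumption for illustration: g = N(0, I) and f = N(0, proposal_variance * I).
    """
    atoms = rng.normal(scale=np.sqrt(proposal_variance), size=(n, d))
    log_w = log_gaussian_density(atoms, 1.0) - log_gaussian_density(atoms, proposal_variance)
    log_w -= log_w.max()          # stabilize before exponentiating
    weights = np.exp(log_w)
    return atoms, weights / weights.sum()  # division by Z_n


rng = np.random.default_rng(0)
p, d, n = 2, 3, 10_000
atoms, weights = importance_sampling_measure(n, d, tempered_variance(p, d), rng)

# Integrate a test function against g_n: E_g[|X|^2] equals d for a standard Gaussian.
second_moment = np.sum(weights * np.sum(atoms**2, axis=1))
print(f"weighted estimate of E_g[|X|^2]: {second_moment:.3f} (exact value: {d})")
```

Working with log-weights and subtracting the maximum before exponentiating leaves the normalized weights unchanged, since the common factor cancels in the division by $Z_n$, while avoiding underflow in higher dimensions.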