Posterior concentration and adaptation of the mixing measure in Dirichlet process mixtures
Abstract
We study the asymptotic properties of the posterior on the latent space for infinite mixtures driven by a Dirichlet process, both in terms of mixing measure and clustering behaviour. In the well-specified regime, where the data are generated by a finite mixture of location densities, we show that the posterior is adaptive to the true number of components K: indeed the cumulative mass assigned to weights of the stick-breaking representation beyond the K-th one vanishes as $n^{-1/2}$, up to terms growing slower than any polynomial. This also implies a nearly optimal posterior contraction rate for the mixing measure in Wasserstein distance. A remarkable phase transition underlies this result: approximating the mixing measure to any precision finer than $n^{-1/2}$ requires a number of components growing logarithmically with the sample size. We show that this has a profound impact on the clustering behaviour: the number of clusters grows logarithmically, as in the prior case, but the proportion of observations outside the K largest clusters vanishes polynomially fast. Finally, we turn these results into posterior guarantees for truncation-based approximations: while any truncation with at least K elements recovers the optimal contraction rates for both density and mixing measure, O( n) components are both necessary and sufficient to reproduce the clustering of the exact posterior.
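For readers less familiar with the objects named in the abstract, the following is a minimal Python sketch of two prior quantities it refers to: the stick-breaking weights of a Dirichlet process, together with the mass left beyond the first K weights, and the number of clusters among n observations under the prior, which grows roughly like α log n. It illustrates the prior only and is not the paper's method or its posterior results. The concentration parameter `alpha`, the truncation level 50, the choice K = 3, and all function names are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw the first `truncation` stick-breaking weights of a DP(alpha) prior."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(truncation):
        v = rng.betavariate(1.0, alpha)  # V_j ~ Beta(1, alpha)
        weights.append(remaining * v)    # w_j = V_j * prod_{l<j} (1 - V_l)
        remaining *= 1.0 - v
    return weights


def tail_mass(weights, k):
    """Mass not carried by the first k weights (includes the truncated remainder)."""
    return 1.0 - sum(weights[:k])


def prior_num_clusters(alpha, n, rng):
    """Number of clusters among n observations under the Chinese restaurant process."""
    clusters = 0
    for i in range(n):
        # Observation i + 1 opens a new cluster with probability alpha / (alpha + i).
        if rng.random() < alpha / (alpha + i):
            clusters += 1
    return clusters


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = stick_breaking_weights(alpha=1.0, truncation=50, rng=rng)
prior_tail = tail_mass(weights, k=3)  # hypothetical K = 3
num_clusters = prior_num_clusters(alpha=1.0, n=10_000, rng=rng)
```

Under the prior, the tail mass beyond any fixed K does not shrink with n. The abstract's claim is that under the posterior, in the well-specified regime, it vanishes at rate $n^{-1/2}$ up to sub-polynomial terms, while the cluster count keeps its logarithmic growth.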