A Wasserstein index of dependence for random measures
Abstract
Optimal transport and Wasserstein distances are flourishing in many scientific fields as a means for comparing and connecting random structures. Here we pioneer the use of an optimal transport distance between L\'evy measures to solve a statistical problem. Dependent Bayesian nonparametric models provide flexible inference on distinct, yet related, groups of observations. Each component of a vector of random measures models a group of exchangeable observations, while their dependence regulates the borrowing of information across groups. We derive the first statistical index of dependence in [0,1] for (completely) random measures that accounts for their whole infinite-dimensional distribution, which is assumed to be equal across different groups. This is accomplished by using the geometric properties of the Wasserstein distance to solve a max-min problem at the level of the underlying L\'evy measures. The Wasserstein index of dependence sheds light on the models' deep structure and has desirable properties: (i) it is 0 if and only if the random measures are independent; (ii) it is 1 if and only if the random measures are completely dependent; (iii) it simultaneously quantifies the dependence of d 2 random measures, avoiding the need for pairwise comparisons; (iv) it can be evaluated numerically. Moreover, the index allows for informed prior specifications and fair model comparisons for Bayesian nonparametric models.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.