A generalization for the expected value of the earth mover's distance
Abstract
The earth mover's distance (EMD), also called the first Wasserstein distance, can be naturally extended to compare arbitrarily many probability distributions, rather than only two, on the set [n] = {1, …, n}. We present the details of this generalization, along with a highly efficient algorithm inspired by combinatorics; it turns out that in the special case of three distributions, the EMD is half the sum of the pairwise EMDs. Extending the methods of Bourn and Willenbring (arXiv:1903.03673), we compute the expected value of this generalized EMD on random d-tuples of distributions, using a generating function which coincides with the Hilbert series of the Segre embedding. We then use the EMD to analyze a real-world data set of grade distributions.
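As an illustration of the three-distribution statement, here is a minimal Python sketch. It assumes the ground distance on [n] is |i - j| with no normalization, in which case the two-distribution EMD equals the sum of absolute differences of the cumulative distribution functions (a standard fact for distributions on a line). The function names `emd` and `emd3` are hypothetical, and this is not the combinatorial algorithm the abstract refers to; `emd3` simply applies the "half the sum of the pairwise EMDs" identity stated above.

```python
from itertools import accumulate


def emd(p, q):
    """EMD between two distributions on [n], assuming ground distance |i - j|.

    Uses the one-dimensional identity: EMD = sum of |CDF_p(k) - CDF_q(k)|.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def emd3(p, q, r):
    """Three-distribution EMD via the identity quoted in the abstract:
    half the sum of the three pairwise EMDs."""
    return (emd(p, q) + emd(p, r) + emd(q, r)) / 2


# Point masses at positions 1, 2, and 3 of [3].
p = [1.0, 0.0, 0.0]
q = [0.0, 1.0, 0.0]
r = [0.0, 0.0, 1.0]

pairwise_pr = emd(p, r)   # moving all mass from position 1 to position 3
triple = emd3(p, q, r)
```

In this example the pairwise EMDs are 1, 2, and 1, so the three-way value is 2.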