Ordered and size-biased frequencies in GEM and Gibbs models for species sampling
Abstract
We describe the distribution of frequencies ordered by sample values in a random sample of size n from the two-parameter GEM(α,θ) random discrete distribution on the positive integers. These frequencies are a (size − α)-biased random permutation of the sample frequencies in either ranked order, or in the order of appearance of values in the sampling process. This generalizes a well-known identity in distribution due to Donnelly and Tavaré (1986) for α = 0 to the case 0 ≤ α < 1. This description extends to sampling from Gibbs(α) frequencies obtained by suitable conditioning of the GEM(α,θ) model, and yields a value-ordered version of the Chinese Restaurant construction of GEM(α,θ) and Gibbs(α) frequencies in the more usual size-biased order of their appearance. The proofs are based on a general construction of a finite sample (X_1, …, X_n) from any random frequencies in size-biased order from the associated exchangeable random partition Π_∞ of ℕ which they generate.
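The two objects named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: it uses the standard two-parameter Chinese Restaurant seating rule to produce sample frequencies in order of appearance, then applies a (size − α)-biased random permutation, in which each remaining block is picked with probability proportional to its count minus α. The function names `crp_counts` and `biased_permutation` are hypothetical, and the sketch assumes n ≥ 1, 0 ≤ α < 1 and θ > −α.

```python
import random


def crp_counts(n, alpha, theta, rng):
    """Block counts, in order of appearance, of a two-parameter
    Chinese Restaurant partition of n items (assumes n >= 1)."""
    counts = [1]  # the first item always opens the first block
    for _ in range(1, n):
        k = len(counts)
        # Existing block i has weight (count_i - alpha);
        # a new block has weight (theta + k * alpha).
        weights = [c - alpha for c in counts] + [theta + k * alpha]
        j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            counts.append(1)
        else:
            counts[j] += 1
    return counts


def biased_permutation(counts, alpha, rng):
    """(size - alpha)-biased random permutation: repeatedly remove a
    block with probability proportional to (count - alpha)."""
    pool = list(counts)
    out = []
    while pool:
        weights = [c - alpha for c in pool]
        j = rng.choices(range(len(pool)), weights=weights)[0]
        out.append(pool.pop(j))
    return out


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only for reproducibility
    appearance_order = crp_counts(30, alpha=0.5, theta=1.0, rng=rng)
    permuted = biased_permutation(appearance_order, alpha=0.5, rng=rng)
    print("order of appearance:", appearance_order)
    print("(size - alpha)-biased order:", permuted)
```

With α = 0 the permutation step reduces to an ordinary size-biased permutation, which is the setting of the Donnelly and Tavaré identity mentioned above.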