Extremes and gaps in sampling from a GEM random discrete distribution
Abstract
We show that in a sample of size $n$ from a GEM$(0,\theta)$ random discrete distribution, the gaps $G_{i:n} := X_{n-i+1:n} - X_{n-i:n}$ between order statistics $X_{1:n} \le \cdots \le X_{n:n}$ of the sample, with the convention $G_{n:n} := X_{1:n} - 1$, are distributed like the first $n$ terms of an infinite sequence of independent geometric$(i/(i+\theta))$ variables $G_i$. This extends a known result for the minimum $X_{1:n}$ to other gaps in the range of the sample, and implies that the maximum $X_{n:n}$ has the distribution of $1 + \sum_{i=1}^{n} G_i$, hence the known result that $X_{n:n}$ grows like $\theta \log(n)$ as $n \to \infty$, with an asymptotically normal distribution. Other consequences include most known formulas for the exact distributions of GEM$(0,\theta)$ sampling statistics, including the Ewens and Donnelly--Tavar\'e sampling formulas. For the two-parameter GEM$(\alpha,\theta)$ distribution we show that the maximal value grows like a random multiple of $n^{\alpha/(1-\alpha)}$ and find the limit distribution of the multiplier.
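The representation of the maximum as $1 + \sum_{i=1}^{n} G_i$ can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that geometric$(p)$ means a variable on $\{0, 1, 2, \ldots\}$ with $P(G = k) = p(1-p)^k$, so that $G_i$ has mean $\theta/i$, and that GEM$(0,\theta)$ is built by stick-breaking with independent Beta$(1,\theta)$ fractions. The function names are hypothetical and chosen for illustration only.

```python
import math
import random


def expected_maximum(n, theta):
    """Mean of 1 + G_1 + ... + G_n when E[G_i] = theta / i (assumed convention)."""
    return 1.0 + theta * sum(1.0 / i for i in range(1, n + 1))


def sample_gaps_maximum(n, theta, rng):
    """Draw 1 + sum of independent geometric(i/(i+theta)) variables on {0,1,2,...}."""
    total = 1
    for i in range(1, n + 1):
        q = theta / (i + theta)      # failure probability 1 - p, with p = i/(i+theta)
        u = 1.0 - rng.random()       # uniform on (0, 1], avoids log(0)
        # Inversion: P(G >= k) = q**k, so G = floor(log(u) / log(q)).
        total += int(math.log(u) / math.log(q))
    return total


def sample_gem_maximum(n, theta, rng):
    """Draw the maximum of an n-sample from a stick-breaking GEM(0, theta) distribution."""
    sticks = []                      # W_1, W_2, ... ~ Beta(1, theta), shared by the whole sample
    largest = 0
    for _ in range(n):
        j = 0
        while True:
            if j == len(sticks):     # create stick fractions lazily, only when needed
                sticks.append(rng.betavariate(1.0, theta))
            if rng.random() < sticks[j]:
                break                # the sample point lands in block j + 1
            j += 1
        largest = max(largest, j + 1)
    return largest


if __name__ == "__main__":
    rng = random.Random(0)
    n, theta, trials = 20, 2.0, 5000
    gaps = sum(sample_gaps_maximum(n, theta, rng) for _ in range(trials)) / trials
    gem = sum(sample_gem_maximum(n, theta, rng) for _ in range(trials)) / trials
    print("theory:", expected_maximum(n, theta))
    print("sum of geometric gaps:", gaps)
    print("direct GEM sampling:", gem)
```

Under these assumptions the two Monte Carlo averages should both lie close to $1 + \theta \sum_{i=1}^{n} 1/i$, which is approximately $\theta \log(n)$ for large $n$. Each sample point is placed by walking along the sticks and stopping at block $j$ with probability $W_j$, which avoids accumulating floating-point partial sums of the block probabilities.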