Correct ordering in the Zipf-Poisson ensemble

Abstract

We consider a Zipf--Poisson ensemble in which $X_i \sim \mathrm{Poi}(N i^{-\alpha})$ for $\alpha > 1$ and $N > 0$ and integers $i \ge 1$. As $N \to \infty$ the first $n'(N)$ random variables have their proper order $X_1 > X_2 > \cdots > X_{n'}$ relative to each other, with probability tending to 1 for $n'$ up to $(AN/\log(N))^{1/(\alpha+2)}$ for an explicit constant $A(\alpha) \ge 3/4$. The rate $N^{1/(\alpha+2)}$ cannot be achieved. The ordering of the first $n'(N)$ entities does not preclude $X_m > X_{n'}$ for some interloping $m > n'$. The first $n''$ random variables are correctly ordered exclusive of any interlopers, with probability tending to 1 if $n'' \le (BN/\log(N))^{1/(\alpha+2)}$ for $B < A$. For a Zipf--Poisson model of the British National Corpus, which has a total word count of 100,000,000, our result estimates that the 72 words with the highest counts are properly ordered.
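
To make the scale $(AN/\log(N))^{1/(\alpha+2)}$ concrete, the following is a minimal sketch, not the paper's own computation. The function names `ordering_scale` and `is_properly_ordered` are hypothetical, and the default $A = 3/4$ is only the lower bound on $A(\alpha)$ quoted above, so the returned value is a conservative scale rather than the explicit constant. The result is asymptotic, and the sketch makes no attempt to reproduce the figure of 72 for the British National Corpus, which depends on the fitted $N$ and $\alpha$.

```python
import math


def ordering_scale(N, alpha, A=0.75):
    """Return (A * N / log(N)) ** (1 / (alpha + 2)).

    Assumption: A defaults to 3/4, the stated lower bound on A(alpha);
    the explicit constant itself is not given in the abstract.
    """
    if N <= 1:
        raise ValueError("N must exceed 1 so that log(N) is positive")
    if alpha <= 1:
        raise ValueError("the ensemble is defined for alpha > 1")
    if A <= 0:
        raise ValueError("A must be positive")
    return (A * N / math.log(N)) ** (1.0 / (alpha + 2.0))


def is_properly_ordered(counts, n):
    """Check the strict ordering X_1 > X_2 > ... > X_n on observed counts.

    counts[0] plays the role of X_1. Interlopers (some X_m with m > n
    exceeding X_n) are deliberately not checked here.
    """
    return all(counts[i] > counts[i + 1] for i in range(n - 1))
```

Here `is_properly_ordered` corresponds to the first notion of ordering (the $n'$ result); testing the interloper-free notion would additionally require comparing `counts[n - 1]` against every later count.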
