The Construction of Near-optimal Universal Coding of Integers

Abstract

Universal coding of integers (UCI) is suited to discrete memoryless sources with unknown probability distributions and countably infinite alphabets. A UCI is a class of prefix codes for which the ratio of the average codeword length to $\max\{1, H(P)\}$ is bounded by a constant expansion factor $C_{\mathcal{C}}$ for any decreasing probability distribution $P$, where $H(P)$ is the entropy of $P$. For any UCI code $\mathcal{C}$, the minimum expansion factor $C_{\mathcal{C}}^{*}$ is defined as the infimum of the set of expansion factors of $\mathcal{C}$. Each $\mathcal{C}$ has a unique $C_{\mathcal{C}}^{*}$, and the smaller $C_{\mathcal{C}}^{*}$ is, the better the compression performance of $\mathcal{C}$. The class of UCIs $\mathcal{C}$ (or a family $\{\mathcal{C}_i\}_{i=1}^{\infty}$) that achieves the smallest $C_{\mathcal{C}}^{*}$ is defined as the optimal UCI. The best known result is that the minimum expansion factor of the optimal UCI satisfies $2 \leq C_{\mathcal{C}}^{*} \leq 2.5$. In this paper, we prove a tighter probability inequality for decreasing distributions, which serves as a new tool for studying the properties of UCIs. On the basis of this inequality, we prove that there exists a class of near-optimal UCIs, called the $\nu$ code, achieving $C_{\nu} = 2.0386$. This narrows the range of the minimum expansion factor of the optimal UCI to $2 \leq C_{\mathcal{C}}^{*} \leq 2.0386$. We show that the $\nu$ code is currently the best known UCI in terms of the minimum expansion factor. In addition, we give a new proof that the minimum expansion factor of the optimal UCI is lower bounded by 2.
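To make the ratio in the definition concrete, the following minimal sketch evaluates the average codeword length divided by $\max\{1, H(P)\}$ for one finite decreasing distribution. It uses the classical Elias gamma code as a stand-in, because the abstract does not specify how the $\nu$ code is constructed. The function names and the truncated power-law test distribution are assumptions made for illustration only.

```python
import math

def elias_gamma_length(n: int) -> int:
    """Bit length of the Elias gamma codeword for an integer n >= 1.

    Elias gamma is used here only as a well-known stand-in code;
    it is NOT the nu code from the paper.
    """
    if n < 1:
        raise ValueError("Elias gamma encodes positive integers only")
    return 2 * (n.bit_length() - 1) + 1  # 2*floor(log2 n) + 1

def entropy(p):
    """Shannon entropy H(P) in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def expansion_ratio(p, length_fn):
    """Average codeword length divided by max{1, H(P)}.

    p[i-1] is the probability of integer i; p is assumed to be
    non-increasing, as the UCI definition requires.
    """
    if any(a < b for a, b in zip(p, p[1:])):
        raise ValueError("distribution must be non-increasing")
    avg_len = sum(q * length_fn(i) for i, q in enumerate(p, start=1))
    return avg_len / max(1.0, entropy(p))

def truncated_power_law(m: int, s: float = 2.0):
    """Hypothetical test distribution: P(i) proportional to i**(-s), i = 1..m."""
    weights = [i ** (-s) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    p = truncated_power_law(1000)
    print(expansion_ratio(p, elias_gamma_length))
```

The value obtained for any single distribution is only a lower bound on the expansion factor of the code being evaluated, since the definition requires the bound to hold for every decreasing $P$.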
