On principles of large deviation and selected data compression
Abstract
The Shannon noiseless coding theorem (the data-compression principle) asserts that for an information source with an alphabet $\mathcal{X}=\{0,\ldots,\ell-1\}$ and an asymptotic equipartition property, one can reduce the number of stored strings $(x_0,\ldots,x_{n-1})\in\mathcal{X}^n$ to $\ell^{nh}$ with an arbitrarily small error probability. Here $h$ is the entropy rate of the source (calculated to the base $\ell$). We consider further reduction based on the concept of the utility of a string, measured in terms of the rate of a weight function. The novelty of the work is that the distribution of memory is analyzed from a probabilistic point of view. A convenient tool for assessing the degree of reduction is a probabilistic large deviation principle. Assuming a Markov-type setting, we discuss some relevant formulas, including the case of a general alphabet.
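To make the count $\ell^{nh}$ concrete, here is a minimal Python sketch for a two-state Markov source. The transition matrix `P`, the string length `n`, and the helper names `stationary_distribution` and `entropy_rate` are illustrative assumptions, not taken from the paper. The sketch covers only the classical Shannon count, not the weight-based reduction the abstract goes on to describe.

```python
import math

def stationary_distribution(P, iters=1000):
    """Approximate the stationary row vector of P by power iteration."""
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
    return pi

def entropy_rate(P, base):
    """Entropy rate h = -sum_i pi_i sum_j P_ij log_base(P_ij) of a Markov chain."""
    pi = stationary_distribution(P)
    h = 0.0
    for i, row in enumerate(P):
        for p in row:
            if p > 0:  # terms with zero probability contribute nothing
                h -= pi[i] * p * math.log(p, base)
    return h

# Hypothetical two-state source (alphabet size ell = 2).
P = [[0.9, 0.1],
     [0.2, 0.8]]
ell = len(P)
n = 100

pi = stationary_distribution(P)
h = entropy_rate(P, ell)          # entropy rate to the base ell
typical_count = ell ** (n * h)    # roughly ell^(n h) strings need to be stored
total_count = ell ** n            # all ell^n strings of length n

print(f"h = {h:.4f}")
print(f"stored strings ~ {typical_count:.3e} out of {total_count:.3e}")
```

Because $h<1$ whenever the source is not uniform and independent, the stored set is an exponentially small fraction of all $\ell^n$ strings.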