Lempel-Ziv Complexity, Empirical Entropies, and Chain Rules
Abstract
We derive upper and lower bounds on the overall compression ratio of the 1978 Lempel-Ziv (LZ78) algorithm, applied independently to k-blocks of a finite individual sequence. Both bounds are given in terms of normalized empirical entropies of the given sequence. For the bounds to be tight and meaningful, the order of the empirical entropy should be small relative to k in the upper bound, but large relative to k in the lower bound. Several non-trivial conclusions arise from these bounds. One of them is a certain form of a chain rule of the Lempel-Ziv (LZ) complexity, which decomposes the joint LZ complexity of two sequences, say, and , into the sum of the LZ complexity of and the conditional LZ complexity of given (up to small terms). The price of this decomposition, however, is in changing the length of the block. Additional conclusions are discussed as well.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.