On the complexity and approximability of Bounded access Lempel Ziv coding
Abstract
We study the complexity of constructing an optimal parsing of a string s = s_1 … s_n under the constraint that, given a position p in the original text and the LZ76-like (Lempel-Ziv 76) encoding of s based on this parsing, it is possible to identify/decompress the character s_p by performing at most c accesses to the LZ encoding, for a given integer c. We refer to such a parsing as a c-bounded access LZ parsing, or c-BLZ parsing, of s. We show that for any constant c the problem of computing the optimal c-BLZ parsing of a string, i.e., the one with the minimum number of phrases, is NP-hard and also APX-hard, i.e., no PTAS can exist under the standard complexity assumption P ≠ NP. We also study the ratio between the sizes of an optimal c-BLZ parsing of a string s and an optimal LZ76 parsing of s (which can be greedily computed in polynomial time).
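To make the access bound concrete, the following minimal Python sketch counts how many phrase accesses are needed to resolve each position of a given parsing. It is an illustration under stated assumptions, not the paper's formal model: the phrase representation (`("lit", ch)` and `("copy", source_start, length)`), the helper names `access_counts`, `is_c_bounded`, and `decode`, and the counting convention (a literal costs one access, a copied position costs one access plus the cost of its source position) are all hypothetical choices made here.

```python
from typing import List, Tuple, Union

# Hypothetical phrase representation (an assumption, not taken from the paper):
#   ("lit", ch)                 a single explicit character
#   ("copy", src, length)       copy `length` characters starting at text position `src`
Literal = Tuple[str, str]
Copy = Tuple[str, int, int]
Phrase = Union[Literal, Copy]


def access_counts(phrases: List[Phrase]) -> List[int]:
    """For each text position, count the phrase accesses needed to resolve it.

    Assumed convention: a literal costs 1 access; a copied position costs
    1 access plus the cost of the position it copies from.
    """
    counts: List[int] = []
    for phrase in phrases:
        if phrase[0] == "lit":
            counts.append(1)
        else:
            _, src, length = phrase
            start = len(counts)
            if not 0 <= src < start:
                raise ValueError("copy source must start before the phrase")
            for k in range(length):
                # src + k is always already resolved, even for overlapping copies.
                counts.append(1 + counts[src + k])
    return counts


def is_c_bounded(phrases: List[Phrase], c: int) -> bool:
    """True if every position is resolved with at most c accesses."""
    return max(access_counts(phrases), default=0) <= c


def decode(phrases: List[Phrase]) -> str:
    """Decompress the whole parsing, to check that it spells the intended text."""
    out: List[str] = []
    for phrase in phrases:
        if phrase[0] == "lit":
            out.append(phrase[1])
        else:
            _, src, length = phrase
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)


# Two parsings of the same text "abababab".
few_phrases = [("lit", "a"), ("lit", "b"), ("copy", 0, 6)]
shallow = [("lit", "a"), ("lit", "b"), ("copy", 0, 2), ("copy", 0, 2), ("copy", 0, 2)]
```

Under these assumptions, `few_phrases` uses three phrases but its last position needs four accesses, while `shallow` uses five phrases and never needs more than two. This is the kind of trade-off between phrase count and access depth that the c-BLZ constraint imposes.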