Range Predecessor and Lempel-Ziv Parsing
Abstract
The Lempel-Ziv parsing of a string (LZ77 for short) is one of the most important and widely used algorithmic tools in data compression and string processing. We show that the Lempel-Ziv parsing of a string of length n on an alphabet of size σ can be computed in O(n log log σ) time (O(n) time if we allow randomization) using O(n log σ) bits of working space; that is, using space proportional to that of the input string in bits. The previous fastest algorithm using O(n log σ) space takes O(n(log σ + log log n)) time. We also consider the important rightmost variant of the problem, where the goal is to associate with each phrase of the parsing its most recent occurrence in the input string. We solve this problem in O(n(1 + log σ/√(log n))) time, using the same working space as above. The previous best solution for rightmost parsing uses O(n(1 + log σ/log log n)) time and O(n log n) space. As a bonus, in our solution for rightmost parsing we provide a faster construction method for efficient 2D orthogonal range reporting, which is of independent interest.
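To make the problem statement concrete, the following is a minimal, deliberately naive sketch of an LZ77 parsing that reports the rightmost (most recent) earlier occurrence of each phrase. It is not the algorithm of the paper: it runs in quadratic time or worse and is only meant to illustrate what is being computed. The function names `lz77_rightmost` and `decode`, the tuple encoding of phrases, and the choice of LZ77 variant (each phrase is either a single new character or the longest prefix of the remaining text that also starts at an earlier position, with overlaps allowed) are assumptions made for this illustration.

```python
def lz77_rightmost(s):
    """Naive LZ77 parsing with rightmost sources (illustrative only).

    Returns a list of phrases, each either
      ("lit", ch)            a character with no earlier occurrence, or
      ("copy", src, length)  a copy of s[src:src+length], where src is the
                             largest earlier start position that achieves
                             the longest match.
    """
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_src = 0, -1
        # Try every earlier start position j < i, in increasing order.
        for j in range(i):
            length = 0
            # The match may run past position i (self-overlapping copy).
            while i + length < n and s[j + length] == s[i + length]:
                length += 1
            # ">=" keeps the last (rightmost) j among the longest matches.
            if length > 0 and length >= best_len:
                best_len, best_src = length, j
        if best_len == 0:
            phrases.append(("lit", s[i]))
            i += 1
        else:
            phrases.append(("copy", best_src, best_len))
            i += best_len
    return phrases


def decode(phrases):
    """Rebuild the string from the phrases produced above."""
    out = []
    for p in phrases:
        if p[0] == "lit":
            out.append(p[1])
        else:
            _, src, length = p
            # Copy one character at a time so overlapping copies work.
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)


print(lz77_rightmost("abaabab"))
# [('lit', 'a'), ('lit', 'b'), ('copy', 0, 1), ('copy', 0, 3), ('copy', 4, 1)]
```

In the last phrase, the character "b" occurs earlier at positions 1 and 4; the rightmost variant reports 4, whereas an ordinary parsing may report any earlier occurrence.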