Compressed Subsequence Matching and Packed Tree Coloring
Abstract
We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses $O(n + \frac{n\sigma}{w})$ space and $O(n + \frac{n\sigma}{w} + m \log N \log w \cdot occ)$ or $O(n + \frac{n\sigma}{w} \log w + m \log N \cdot occ)$ time. Here $w$ is the word size and $occ$ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for $occ = o(\frac{n}{\log N})$ occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings.
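To make the "next occurrence of a given character after a given position" query concrete, here is a minimal Python sketch on a plain, uncompressed string. It is not the grammar-compressed data structure described in the abstract; it only illustrates how subsequence matching reduces to repeated next-occurrence queries. The names `build_positions`, `next_occurrence`, and `match_from` are hypothetical and chosen for this illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def build_positions(text):
    """Map each character to the sorted list of indices where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(text):
        positions[ch].append(i)
    return positions


def next_occurrence(positions, ch, start):
    """Return the smallest index i >= start with text[i] == ch, or None."""
    idxs = positions.get(ch, [])
    k = bisect_left(idxs, start)  # binary search in the sorted index list
    return idxs[k] if k < len(idxs) else None


def match_from(positions, pattern, start):
    """Greedily embed a non-empty pattern as a subsequence at indices >= start.

    Returns (first, last) indices of the greedy embedding, or None if the
    pattern does not occur as a subsequence from `start` onward.
    """
    pos = start
    first = None
    for ch in pattern:
        i = next_occurrence(positions, ch, pos)
        if i is None:
            return None
        if first is None:
            first = i
        pos = i + 1  # the next character must come strictly later
    return (first, pos - 1)
```

For example, with `positions = build_positions("abracadabra")`, the call `match_from(positions, "aca", 0)` walks through one next-occurrence query per pattern character. In this uncompressed sketch each query is a binary search; the abstract's contribution is answering the same kind of query directly on the compressed representation.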