Space-efficient SLP encoding for O(log N)-time random access
Abstract
A Straight-Line Program (SLP) G for a string T is a context-free grammar (CFG) that derives only T, and can therefore be regarded as a compressed representation of T. In this paper, we show how to encode G in n log N + (n + n') log(n + σ) + 4n - 2n' + o(n) bits to support random access queries that extract T[p..q] in worst-case O(log N + q - p) time, where N is the length of T, σ is the alphabet size, n is the number of variables in G, and n' ≤ n is the number of symmetric centroid paths in the DAG representation of G. The time complexity is almost optimal because Verbin and Yu [CPM 2013] proved that the O(log N) term cannot be significantly improved in general with poly(n)-space data structures. We also present alternative encodings that achieve the same random access time with n log N + n log(n + σ) + 5n + n' + o(n) or n log N + n log(n + σ) + 5n - n' + σ + o(n + σ) bits of space.
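To make the notions of an SLP and a random access query T[p..q] concrete, here is a minimal Python sketch. It is not the encoding from the paper: it stores rules and expansion lengths in plain lists and descends the grammar naively, so extraction takes time proportional to the grammar height plus q - p rather than the O(log N + q - p) bound above. The representation (a rule is either a terminal character or a pair of earlier variable indices) and the names `expansion_lengths` and `extract` are assumptions made for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical toy representation: variable i has rule rules[i], which is
# either a single terminal character or a pair (left, right) of indices of
# variables defined earlier in the list.
Rule = Union[str, Tuple[int, int]]


def expansion_lengths(rules: List[Rule]) -> List[int]:
    """Return, for each variable, the length of the string it derives."""
    lengths: List[int] = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(1)
        else:
            left, right = rule
            lengths.append(lengths[left] + lengths[right])
    return lengths


def extract(rules: List[Rule], lengths: List[int], root: int, p: int, q: int) -> str:
    """Return T[p..q] (0-based, inclusive) without decompressing all of T."""
    if not 0 <= p <= q < lengths[root]:
        raise IndexError("query range outside the derived string")
    out: List[str] = []

    def visit(var: int, lo: int, hi: int) -> None:
        # lo..hi is the requested range inside the expansion of var.
        rule = rules[var]
        if isinstance(rule, str):
            out.append(rule)
            return
        left, right = rule
        left_len = lengths[left]
        if lo < left_len:  # range overlaps the left child
            visit(left, lo, min(hi, left_len - 1))
        if hi >= left_len:  # range overlaps the right child
            visit(right, max(lo, left_len) - left_len, hi - left_len)

    visit(root, p, q)
    return "".join(out)


# Example grammar: X0 -> a, X1 -> b, X2 -> X0 X1, X3 -> X2 X0,
# X4 -> X3 X2, X5 -> X4 X3, so X5 derives "abaababa".
rules: List[Rule] = ["a", "b", (0, 1), (2, 0), (3, 2), (4, 3)]
lengths = expansion_lengths(rules)
print(extract(rules, lengths, 5, 2, 5))  # prints "aaba"
```

The stored expansion lengths are what let the descent skip subtrees that lie outside the query range. The encodings described in the abstract additionally bound the descent by O(log N) and compress the representation to the stated number of bits, neither of which this sketch attempts.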