C2: Cache-Conscious Succinct Tries with Adaptive Unary Path Compression
Abstract
Succinct tries are powerful string dictionaries because of their low memory footprint and fast query performance. However, existing succinct trie implementations face two key challenges to spatial locality: 1) they incur unnecessary cache misses during queries, especially during trie navigation operations, and 2) they waste significant space when the data contains many unary paths. We propose C2, a set of two techniques: C1 introduces a more cache-friendly layout for the underlying succinct tries, and C2 compresses redundant unary paths. We thoroughly redesign three state-of-the-art succinct tries: FST, CoCo-trie, and Marisa, producing C2-FST, C2-CoCo, and C2-Marisa. Experiments on six diverse datasets show that the C1 optimization improves query performance by 1.58x, 1.12x, and 1.42x, respectively, compared to the original FST, CoCo-trie, and Marisa. Furthermore, the C2 optimization achieves a 1.3x smaller memory footprint on average. The succinct tries optimized with both aspects of C2 achieve better space-time tradeoffs than their original versions and other state-of-the-art succinct tries, while using significantly less space than non-succinct tries like ART and C-ART.
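To make the term "unary path" concrete, the following is a minimal sketch on an ordinary pointer-based trie, not on a succinct (bit-vector) encoding, and it is not the adaptive scheme the paper proposes. It only illustrates why chains of single-child nodes waste space and how merging them shrinks the node count. All names (`Node`, `build_trie`, `compress_unary`, `contains`) and the sample keys are assumptions made for illustration.

```python
from typing import Dict, List


class Node:
    """A plain trie node; edge labels live in the parent's `children` map."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.terminal = False  # True if a key ends at this node


def build_trie(keys: List[str]) -> Node:
    """Build an uncompressed trie with one node per distinct prefix."""
    root = Node()
    for key in keys:
        node = root
        for ch in key:
            node = node.children.setdefault(ch, Node())
        node.terminal = True
    return root


def count_nodes(node: Node) -> int:
    """Count this node and all of its descendants."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


def compress_unary(node: Node) -> None:
    """Merge each chain of single-child, non-terminal nodes into one edge."""
    merged: Dict[str, Node] = {}
    for label, child in node.children.items():
        # Follow the unary path, appending its characters to the edge label.
        while len(child.children) == 1 and not child.terminal:
            (next_label, next_child), = child.children.items()
            label += next_label
            child = next_child
        compress_unary(child)
        merged[label] = child
    node.children = merged


def contains(root: Node, key: str) -> bool:
    """Exact-match lookup; works before and after compression."""
    node, i = root, 0
    while i < len(key):
        for label, child in node.children.items():
            if key.startswith(label, i):
                node, i = child, i + len(label)
                break
        else:
            return False  # no outgoing edge matches the remaining suffix
    return node.terminal


keys = ["romane", "romanus", "romulus", "rubens"]
trie = build_trie(keys)
nodes_before = count_nodes(trie)
compress_unary(trie)
nodes_after = count_nodes(trie)
```

For these four sample keys, the uncompressed trie has one node per distinct prefix, while the compressed trie keeps only the root, the branching nodes, and the key endpoints. A succinct trie would store the topology in bit vectors rather than in Python objects, so the savings reported in the abstract are not comparable to node counts in this sketch.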