Zip-Tries: Simple Dynamic Data Structures for Strings

Abstract

In this paper, we introduce zip-tries, which are simple, dynamic, memory-efficient data structures for strings. Zip-tries support search and update operations for $k$-length strings in $O(k + \log n)$ time in the standard RAM model or in $O(k/\alpha + \log n)$ time in the word RAM model, where $\alpha$ is the length of the longest string that can fit in a memory word, and $n$ is the number of strings in the trie. Importantly, we show how zip-tries can achieve this while only requiring $O(\log\log n + \log\log(k/\alpha))$ bits of metadata per node w.h.p., which is an exponential improvement over previous results for long strings. Despite being considerably simpler and more memory efficient, we show how zip-tries perform competitively with state-of-the-art data structures on large datasets of long strings. Furthermore, we provide a simple, general framework for parallelizing string comparison operations in linked data structures, which we apply to zip-tries to obtain parallel zip-tries. Parallel zip-tries are able to achieve good search and update performance in parallel, performing such operations in $O(\log n)$ span. We also apply our techniques to an existing external-memory string data structure, the string B-tree, obtaining a parallel string B-tree which performs search operations using $O(\log_B n)$ I/O span and $O(k/(\alpha B) + \log_B n)$ I/O work in the parallel external memory (PEM) model. The parallel string B-tree can perform prefix searches using only $O(\log n / \log\log n)$ span under the practical PRAM model. For the case of long strings that share short common prefixes, we provide LCP-aware variants of all our algorithms that should be quite efficient in practice, which we justify empirically.
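To give some intuition for the $k/\alpha$ term and for what "LCP-aware" comparison means, the following is a minimal Python sketch, not the paper's algorithm (which the abstract does not spell out). It assumes that comparing one $\alpha$-character chunk models a single word-sized comparison. The function names `lcp_chunked` and `compare_from` and the default `alpha = 8` are hypothetical, chosen only for illustration.

```python
def lcp_chunked(a: str, b: str, start: int = 0, alpha: int = 8) -> int:
    """Return the length of the longest common prefix of a and b,
    given that their first `start` characters are already known to match.

    Each alpha-character slice comparison stands in for one word-sized
    comparison (an assumption of this sketch).
    """
    limit = min(len(a), len(b))
    i = min(start, limit)
    # Compare whole chunks of alpha characters while they match.
    while i + alpha <= limit and a[i:i + alpha] == b[i:i + alpha]:
        i += alpha
    # Finish character by character inside the first differing
    # chunk or the final partial chunk.
    while i < limit and a[i] == b[i]:
        i += 1
    return i


def compare_from(a: str, b: str, known_lcp: int = 0, alpha: int = 8) -> tuple[int, int]:
    """Compare a and b lexicographically, skipping a prefix already known
    to be shared. Returns (sign, lcp) with sign in {-1, 0, 1}."""
    lcp = lcp_chunked(a, b, known_lcp, alpha)
    if lcp == len(a) and lcp == len(b):
        return 0, lcp          # identical strings
    if lcp == len(a):
        return -1, lcp         # a is a proper prefix of b
    if lcp == len(b):
        return 1, lcp          # b is a proper prefix of a
    return (-1 if a[lcp] < b[lcp] else 1), lcp
```

For two strings that agree on their first $k$ characters, the first loop performs roughly $k/\alpha$ chunk comparisons, and passing a previously computed `known_lcp` avoids rescanning characters that an earlier comparison already matched.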
