On Hardness of Jumbled Indexing
Abstract
Jumbled indexing is the problem of indexing a text $T$ for queries that ask whether there is a substring of $T$ matching a pattern represented as a Parikh vector, i.e., the vector of frequency counts for each character. Jumbled indexing has garnered a lot of interest in the last four years. There is a naive algorithm that preprocesses all answers in $O(n^2|\Sigma|)$ time, allowing quick queries afterwards, and there is another naive algorithm that requires no preprocessing but has $O(n|\Sigma|)$ query time. Despite a tremendous amount of effort, there has been little improvement over these running times. In this paper we provide good reason for this. We show that, under a 3SUM-hardness assumption, jumbled indexing for alphabets of size $\omega(1)$ requires $\Omega(n^{2-\epsilon})$ preprocessing time or $\Omega(n^{1-\delta})$ query time for any $\epsilon, \delta > 0$. In fact, under a stronger 3SUM-hardness assumption, for any constant alphabet size $r \ge 3$ there exist describable fixed constants $\epsilon_r$ and $\delta_r$ such that jumbled indexing requires $\Omega(n^{2-\epsilon_r})$ preprocessing time or $\Omega(n^{1-\delta_r})$ query time.
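To make the two naive baselines mentioned in the abstract concrete, here is a minimal Python sketch. It is an illustration only, not code from the paper: the function names `jumbled_query` and `build_index`, and the convention that a Parikh vector is a tuple of counts in the order of the `alphabet` argument, are assumptions made for this example.

```python
def jumbled_query(text, pattern, alphabet):
    """No-preprocessing baseline (hypothetical helper name).

    Slides a window of length sum(pattern) over text and compares the
    window's character counts with the pattern. Each comparison costs
    O(|alphabet|), giving O(n * |alphabet|) time per query.
    """
    m = sum(pattern)
    if m == 0:
        return True   # the empty substring matches the zero vector
    if m > len(text):
        return False
    index = {c: i for i, c in enumerate(alphabet)}
    target = list(pattern)
    window = [0] * len(alphabet)
    for ch in text[:m]:
        window[index[ch]] += 1
    if window == target:
        return True
    for i in range(m, len(text)):
        window[index[text[i]]] += 1       # character entering the window
        window[index[text[i - m]]] -= 1   # character leaving the window
        if window == target:
            return True
    return False


def build_index(text, alphabet):
    """Preprocessing baseline (hypothetical helper name).

    Stores the Parikh vector of every substring in a set. There are
    O(n^2) substrings and each stored vector has |alphabet| entries,
    so construction takes O(n^2 * |alphabet|) time.
    """
    index = {c: i for i, c in enumerate(alphabet)}
    seen = {tuple([0] * len(alphabet))}   # Parikh vector of the empty substring
    for start in range(len(text)):
        counts = [0] * len(alphabet)
        for end in range(start, len(text)):
            counts[index[text[end]]] += 1
            seen.add(tuple(counts))
    return seen
```

For example, with `text = "abcab"` and `alphabet = "abc"`, the vector `(1, 1, 1)` is matched by the substring `abc`, while `(2, 0, 1)` has no match because no window of length 3 contains two occurrences of `a`. After `build_index`, a query reduces to a set membership test. The hardness results in the abstract say that, under the stated 3SUM assumptions, neither of these two trade-off points can be improved by a polynomial factor on both sides at once.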