Linear-time computation of generalized minimal absent words for multiple strings
Abstract
A string $w$ is called a minimal absent word (MAW) for a string $S$ if $w$ does not occur as a substring in $S$ and all proper substrings of $w$ occur in $S$. MAWs are well-studied combinatorial string objects that have potential applications in areas including bioinformatics, musicology, and data compression. In this paper, we generalize the notion of MAWs to a set $\mathcal{S} = \{S_1, \ldots, S_k\}$ of multiple strings. We first describe our solution to the case of $k = 2$ strings, and show how to compute the set $\mathcal{M}$ of MAWs in optimal $O(n + |\mathcal{M}|)$ time and with $O(n)$ working space, where $n$ denotes the total length of the strings in $\mathcal{S}$. We then move on to the general case of $k > 2$ strings, and show how to compute the set $\mathcal{M}$ of MAWs in $O(n \lceil k / \log n \rceil + |\mathcal{M}|)$ time and with $O(n (k + \log n))$ bits of working space, in the word RAM model with machine word size $\omega = \log n$. The latter algorithm runs in optimal $O(n + |\mathcal{M}|)$ time for $k = O(\log n)$.
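To make the single-string definition concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper, and it does not cover the generalization to multiple strings, whose precise definition is not given in the abstract. The function names `substrings` and `minimal_absent_words` and the optional `alphabet` parameter are hypothetical choices made for illustration. The sketch relies on the observation that a word $w$ is a MAW exactly when $w$ is absent while both $w$ without its first letter and $w$ without its last letter occur, because every other proper substring of $w$ is contained in one of those two.

```python
def substrings(s):
    """Return the set of all substrings of s, including the empty string."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def minimal_absent_words(s, alphabet=None):
    """Brute-force MAWs of a single string s (illustrative, quadratic or worse).

    alphabet is an assumption of this sketch: it defaults to the letters
    that appear in s, in which case no MAW of length 1 exists.
    """
    if alphabet is None:
        alphabet = sorted(set(s))
    present = substrings(s)
    maws = set()
    for v in present:          # v plays the role of w without its first letter
        for a in alphabet:
            w = a + v
            # w is a MAW if it is absent and w without its last letter occurs;
            # w without its first letter is v, which occurs by construction.
            if w not in present and w[:-1] in present:
                maws.add(w)
    return maws


if __name__ == "__main__":
    print(sorted(minimal_absent_words("abaab")))
```

For the string `abaab` over the alphabet `{a, b}`, this yields the four words `aaa`, `aaba`, `bab`, and `bb`. Passing a larger alphabet such as `"abc"` additionally reports the missing letter `c` as a MAW of length 1.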