Single and multiple consecutive permutation motif search

Abstract

Let $t$ be a permutation (that shall play the role of the text) on $[n]$ and let a pattern $p$ be a sequence of $m$ distinct integers of $[n]$, $m \leq n$. The pattern $p$ occurs in $t$ at position $i$ if and only if $p_1 \ldots p_m$ is order-isomorphic to $t_i \ldots t_{i+m-1}$, that is, for all $1 \leq k < \ell \leq m$, $p_k > p_\ell$ if and only if $t_{i+k-1} > t_{i+\ell-1}$. Searching for a pattern $p$ in a text $t$ consists in identifying all occurrences of $p$ in $t$. We first present a forward automaton which allows us to search for $p$ in $t$ in $O(m^2 \log\log m + n)$ time. We then introduce a Morris-Pratt automaton representation of the forward automaton which allows us to reduce this complexity to $O(m \log\log m + n)$ at the price of an additional amortized constant term per integer of the text. Both automata occupy $O(m)$ space. We then extend the problem to searching for a set of patterns and exhibit a specific Aho-Corasick-like algorithm. Next we present a sub-linear average-case search algorithm running in $O\left(\frac{m \log m}{\log\log m} + n \frac{\log m}{m \log\log m}\right)$ time, which we eventually prove to be optimal on average.
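
To make the definition of an occurrence concrete, here is a minimal brute-force sketch in Python. It is not the forward automaton, the Morris-Pratt representation, or the sub-linear algorithm from the abstract; it only checks order-isomorphism window by window, at a cost of roughly O(n m log m). The function names `rank_vector` and `naive_search` are hypothetical and chosen for illustration.

```python
def rank_vector(seq):
    """Return, for each element of seq, its 0-based rank among the values of seq.

    Assumes the values of seq are pairwise distinct, as in the abstract.
    """
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    ranks = [0] * len(seq)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def naive_search(p, t):
    """Return all 1-based positions i where p is order-isomorphic to t[i..i+m-1].

    Two sequences of distinct values are order-isomorphic exactly when
    their rank vectors are equal, so each window is compared by rank vector.
    """
    m, n = len(p), len(t)
    target = rank_vector(p)
    return [i + 1 for i in range(n - m + 1) if rank_vector(t[i:i + m]) == target]


# Example: the pattern 2 1 3 ("middle, low, high") in a permutation of [7].
text = [3, 1, 4, 2, 6, 5, 7]
pattern = [2, 1, 3]
print(naive_search(pattern, text))  # [1, 3, 5]
```

The windows 3 1 4, 4 2 6 and 6 5 7 all have the same relative order as 2 1 3, so the pattern occurs at positions 1, 3 and 5.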
