Optimal pattern matching algorithms
Abstract
We study a class of finite state machines, called w-matching machines, which make it possible to simulate the behavior of pattern matching algorithms while they search for a pattern w. They can be used to compute the asymptotic speed of an algorithm, i.e. the limit of the expected ratio of the number of text accesses to the length of the text, as it parses an iid text to find the pattern w. We define the order of a matching machine or of an algorithm as the maximum difference between the current and accessed positions during a search (standard algorithms are generally of order |w|). We show that, given a pattern w, an order k and an iid model, there exists an optimal w-matching machine, i.e. one with the greatest asymptotic speed under the model among all machines of order k, whose set of states belongs to a finite and enumerable set. It follows that it is possible to determine: 1) the greatest asymptotic speed among a large class of algorithms, with regard to a pattern and an iid model, and 2) a w-matching machine, and thus an algorithm, achieving this speed.
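As a rough illustration of the quantity discussed above, the following sketch counts text accesses for two textbook algorithms (naive search and Horspool) on a randomly drawn iid text and reports the ratio of accesses to text length. It is an empirical estimate only, not the matching-machine computation of the paper; the function names, the alphabet, the letter probabilities and the seed are all assumptions chosen for the example.

```python
import random


def naive_accesses(pattern, text):
    """Naive left-to-right search. Returns (match positions, number of text accesses)."""
    m, n = len(pattern), len(text)
    accesses = 0
    matches = []
    for start in range(n - m + 1):
        for j in range(m):
            accesses += 1  # one read of text[start + j]
            if text[start + j] != pattern[j]:
                break
        else:
            matches.append(start)
    return matches, accesses


def horspool_accesses(pattern, text):
    """Horspool search. Returns (match positions, number of text accesses)."""
    m, n = len(pattern), len(text)
    # Shift for each letter: distance from its rightmost occurrence
    # in pattern[:-1] to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    accesses = 0
    matches = []
    start = 0
    while start + m <= n:
        j = m - 1
        while j >= 0:
            accesses += 1  # one read of text[start + j]
            if text[start + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(start)
        # The last letter of the window was already read above,
        # so looking it up again is not counted as a new access.
        start += shift.get(text[start + m - 1], m)
    return matches, accesses


def iid_text(alphabet, probs, length, seed=0):
    """Draw an iid text: each letter is sampled independently with the given probabilities."""
    rng = random.Random(seed)
    return "".join(rng.choices(alphabet, weights=probs, k=length))


def access_ratio(search, pattern, text):
    """Ratio of the number of text accesses to the length of the text."""
    _, accesses = search(pattern, text)
    return accesses / len(text)


# Hypothetical setting: pattern "abab" over a uniform two-letter alphabet.
text = iid_text("ab", [0.5, 0.5], 100_000)
for name, search in [("naive", naive_accesses), ("horspool", horspool_accesses)]:
    print(name, access_ratio(search, "abab", text))
```

Averaging such ratios over many long texts only approximates the limit of the expected ratio, whereas the matching-machine framework described in the abstract is meant to compute it from the model itself.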