A fast implementation of the good-suffix array for the Boyer-Moore string matching algorithm
Abstract
String matching is the problem of finding all the occurrences of a pattern in a text. It has been intensively studied and the Boyer-Moore string matching algorithm is probably one of the most famous solution to this problem. This algorithm uses two precomputed shift tables called the good-suffix table and the bad-character table. The good-suffix table is tricky to compute in linear time. Text book solutions perform redundant operations. Here we present a fast implementation for this good-suffix table based on a tight analysis of the pattern. Experimental results show two versions of this new implementation are the fastest in almost all tested situations.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.