A Fast Heuristic for Exact String Matching
Abstract
Given a pattern string P of length n consisting of δ distinct characters and a query string T of length m, where the characters of P and T are drawn from an alphabet of size Δ, the exact string matching problem consists of finding all occurrences of P in T. For this problem, we present a randomized heuristic that preprocesses P in O(nδ) time to identify sparse(P), a rarely occurring substring of P, and then uses it to find all occurrences of P in T efficiently. This heuristic has an expected search time of O(m / min(|sparse(P)|, Δ)), where |sparse(P)| is at least δ. We also show that for a pattern string P whose characters are chosen uniformly at random from an alphabet of size Δ, E[|sparse(P)|] is Ω(Δ log(2Δ / (2Δ − δ))).
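The abstract does not say how sparse(P) is constructed or exactly how the search uses it, so the following is only a minimal sketch of one plausible reading: a filter-and-verify search in which some substring of P acts as an anchor and the full pattern is checked around each anchor hit. The function name `find_all` and the parameters `anchor_start` and `anchor_len` are hypothetical, the anchor is supplied by the caller rather than computed by the paper's preprocessing, and Python's built-in `str.find` stands in for whatever scanning procedure the paper uses. The sketch makes no claim to the expected search time stated above.

```python
def find_all(pattern: str, text: str, anchor_start: int, anchor_len: int) -> list[int]:
    """Return every index i such that text[i:i + len(pattern)] == pattern.

    Assumption (not from the paper): the anchor is the substring
    pattern[anchor_start:anchor_start + anchor_len], chosen by the caller.
    """
    n, m = len(pattern), len(text)
    if anchor_len <= 0 or anchor_start < 0 or anchor_start + anchor_len > n:
        raise ValueError("anchor must be a non-empty substring of pattern")
    if n > m:
        return []

    anchor = pattern[anchor_start:anchor_start + anchor_len]
    hits = []
    # Start at anchor_start so the implied pattern start is never negative.
    pos = text.find(anchor, anchor_start)
    while pos != -1:
        start = pos - anchor_start  # where the pattern would have to begin
        # Verify the whole pattern; a slice running past the end simply fails.
        if text[start:start + n] == pattern:
            hits.append(start)
        # Advance by one so overlapping anchor occurrences are not skipped.
        pos = text.find(anchor, pos + 1)
    return hits


if __name__ == "__main__":
    # Anchor on the single character "c" at index 2 of the pattern.
    print(find_all("abcab", "xabcabcabz", 2, 1))
```

The intuition the sketch tries to convey is that the rarer the anchor is in T, the fewer full verifications are attempted; any valid anchor still yields the complete set of occurrences, since every occurrence of P contains the anchor at the same offset.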