Faster Approximate String Matching for Short Patterns

Abstract

We study the classical approximate string matching problem: given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have lengths $m$ and $n$, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$ we present an algorithm using time

$$O\left(nk \cdot \min\left(\frac{\log^2 m}{\log n}, \frac{\log^2 m \log w}{w}\right) + n\right).$$

When $P$ is short, namely, $m = 2^{o(\sqrt{\log n})}$ or $m = 2^{o(\sqrt{w/\log w})}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.
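To make the problem statement concrete, the following is a minimal Python sketch of the textbook $O(nm)$ dynamic-programming baseline (often attributed to Sellers). It is not the algorithm of this paper and uses neither tabulation nor word-level parallelism; it only illustrates what "ending positions of substrings of $Q$ within edit distance $k$ of $P$" means. The function name `approx_match_end_positions` and the choice of 1-based positions are assumptions made for illustration.

```python
def approx_match_end_positions(pattern: str, text: str, k: int) -> list[int]:
    """Return the 1-based positions j such that some substring of `text`
    ending at j has edit distance at most k to `pattern`.

    Illustrative O(n*m) baseline only; hypothetical helper, not the
    paper's method.
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    # Before reading any text, matching pattern[:i] costs i deletions.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value of row 0 in the previous column
        col[0] = 0          # a match may start anywhere in the text
        for i in range(1, m + 1):
            cur = min(
                col[i] + 1,                         # extra text character
                col[i - 1] + 1,                     # unmatched pattern character
                prev_diag + (pattern[i - 1] != c),  # match or substitution
            )
            prev_diag = col[i]  # save previous-column value for next row
            col[i] = cur
        if col[m] <= k:
            ends.append(j)
    return ends
```

For example, with pattern `abc` and text `xabcx`, the sketch reports only position 4 for $k = 0$, and positions 3, 4, and 5 for $k = 1$. The bound stated above replaces the $m$ factor of this baseline with a term depending on $k$ and polylogarithmic quantities in $m$, $n$, and $w$.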
