On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching
Abstract
We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k \sigma^k}{p} \max(k, \lg\lg n) + (1 + \frac{m}{p}) \lg p \cdot \lg\lg n + \mathrm{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1 + \lg_p \lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
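To make the merging operation concrete, the following is a minimal sequential sketch, not the data structure of this paper. It assumes the inverse suffix array is available and uses plain binary search, so a merge costs O(log n) time rather than the bound stated above. It relies on a standard observation: within the interval of P, the suffixes are ordered by what follows P, so the ranks of the suffixes starting |P| positions later are increasing. All function names (`suffix_array`, `inverse`, `interval`, `merge_intervals`) are hypothetical and chosen for illustration only.

```python
from bisect import bisect_left, bisect_right

def suffix_array(text):
    # Naive construction by sorting all suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])

def inverse(sa):
    # isa[pos] = rank of the suffix starting at pos.
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa

def interval(text, sa, pattern):
    # Half-open interval [lo, hi) of suffixes that start with pattern.
    # Truncating sorted suffixes to len(pattern) keeps them sorted.
    m = len(pattern)
    prefixes = [text[i:i + m] for i in sa]
    return bisect_left(prefixes, pattern), bisect_right(prefixes, pattern)

def merge_intervals(sa, isa, m, iv_p, iv_q):
    # Given the interval iv_p of P (with |P| = m) and iv_q of P',
    # return the half-open interval of the concatenation PP'.
    n = len(sa)
    lo, hi = iv_p
    q_lo, q_hi = iv_q

    def key(i):
        # Rank of the suffix that follows the occurrence of P at sa[i];
        # the empty suffix sorts before every other suffix.
        j = sa[i] + m
        return isa[j] if j < n else -1

    def first_at_least(bound, a, b):
        # Smallest i in [a, b) with key(i) >= bound (key is increasing).
        while a < b:
            mid = (a + b) // 2
            if key(mid) < bound:
                a = mid + 1
            else:
                b = mid
        return a

    start = first_at_least(q_lo, lo, hi)
    end = first_at_least(q_hi, start, hi)
    return start, end

text = "abracadabra"
sa = suffix_array(text)
isa = inverse(sa)
merged = merge_intervals(sa, isa, 2, interval(text, sa, "ab"), interval(text, sa, "ra"))
```

Here `merged` is the interval of "abra" obtained from the intervals of "ab" and "ra" without scanning the text again, which is the operation that the parallel algorithms apply repeatedly to pieces of the pattern.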