Error Tree: A Tree Structure for Hamming & Edit Distances & Wildcards Matching
Abstract
Error Tree is a novel tree structure for approximate pattern matching under Hamming distance and edit distance, and for wildcard matching. The input is a text of length $n$ over a fixed alphabet of size $\Sigma$, a pattern $P$ of length $m$, and a threshold $k$. The output is every position in the text that matches $P$ within Hamming distance at most $k$, within edit distance at most $k$, or with at most $k$ wildcards. For Hamming distance and wildcard matching, the proposed tree structure needs $O\left(n \frac{\log^k n}{k!}\right)$ words and answers a query in $O\left(\frac{m^k}{k!} + occ\right)$ time ($O\left(m + \frac{\log^k n}{k!} + occ\right)$ in the average case) for any online or offline pattern, where $occ$ is the number of reported positions. For edit distance, a tree structure of $O\left(2^k n \frac{\log^k n}{k!}\right)$ words answers a query in $O\left(\frac{m^k}{k!} + 3^k occ\right)$ time ($O\left(m + \frac{\log^k n}{k!} + 3^k occ\right)$ in the average case) for any online or offline pattern.
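To make the three problem statements concrete, the following is a minimal brute-force sketch of the expected outputs. It is not the Error Tree and makes no attempt at the bounds stated above; it only serves as a reference for what a correct answer looks like. The function names, the `?` wildcard symbol, and the choice to report edit-distance matches by their end position are assumptions made for illustration.

```python
def hamming_matches(text, pattern, k):
    """Start positions i where text[i:i+m] differs from pattern in <= k places."""
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        if mismatches <= k:
            hits.append(i)
    return hits


def wildcard_matches(text, pattern, wildcard="?"):
    """Start positions where pattern matches; `wildcard` (assumed '?') matches any symbol."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == wildcard or p == t for p, t in zip(pattern, text[i:i + m]))
    ]


def edit_matches(text, pattern, k):
    """End positions j (exclusive) such that some substring of text ending at j
    is within edit distance k of pattern. Classic O(n*m) dynamic programming."""
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    hits = []
    for j, c in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0 because a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text symbol
                         cur[i - 1] + 1)      # missing pattern symbol
        if cur[m] <= k:
            hits.append(j)
        prev = cur
    return hits
```

For the text `"abcabd"` and pattern `"abd"`, `hamming_matches` with `k = 1` reports start positions 0 and 3, and `wildcard_matches` with the pattern `"ab?"` reports the same two positions. Each call scans the whole text, which is the per-query cost an index structure such as the one proposed here is meant to avoid.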