Faster Approximate(d) Text-to-Pattern L1 Distance

Abstract

The problem of finding the distance between a pattern of length m and a text of length n is a typical way of generalizing pattern matching to incorporate a dissimilarity score. For both Hamming and L1 distances, only a superlinear upper bound of Õ(n√m) is known, which prompts the question of relaxing the problem: either by asking for a (1 ± ε)-approximate distance (every distance is reported up to a multiplicative factor), or for a k-approximated distance (distances exceeding k are reported as ∞). We focus on the L1 distance, for which we show new algorithms achieving complexities Õ(ε⁻¹ n) and Õ((m + k√m) · n/m), respectively. This is a significant improvement upon the previous algorithms with runtimes Õ(ε⁻² n) of Lipsky and Porat [Algorithmica 2011] and Õ(n√k) of Amir, Lipsky, Porat and Umanski [CPM 2005].
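
To make the problem statement concrete, the following is a minimal Python sketch of the quantities the abstract refers to. It is the naive O(nm) baseline that evaluates the definition directly, not the algorithms of the paper. It assumes integer sequences, and the function names (`l1_distances`, `k_approximated`, `is_eps_approximation`) are hypothetical, introduced only for illustration.

```python
from math import inf


def l1_distances(text, pattern):
    """Exact L1 distance between the pattern and every alignment in the text.

    Naive baseline: O(n * m) time, one output per starting position.
    """
    n, m = len(text), len(pattern)
    return [
        sum(abs(text[i + j] - pattern[j]) for j in range(m))
        for i in range(n - m + 1)
    ]


def k_approximated(distances, k):
    """k-approximated output: distances exceeding k are reported as infinity."""
    return [d if d <= k else inf for d in distances]


def is_eps_approximation(estimates, distances, eps):
    """Check that every estimate is within a (1 +/- eps) factor of the truth."""
    return all(
        (1 - eps) * d <= e <= (1 + eps) * d
        for e, d in zip(estimates, distances)
    )


if __name__ == "__main__":
    exact = l1_distances([1, 5, 2, 8, 3], [2, 4])
    print(exact)
    print(k_approximated(exact, 4))
```

The two relaxations correspond to the two helper functions: a (1 ± ε)-approximate algorithm may return any estimates for which `is_eps_approximation` holds, while a k-approximated algorithm must return exactly the output of `k_approximated`.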
