Approximate Hamming distance in a stream

Abstract

We consider the problem of computing a (1+ε)-approximation of the Hamming distance between a pattern of length n and successive substrings of a stream. We first look at the one-way randomised communication complexity of this problem, giving Alice the first half of the stream and Bob the second half. We show the following: (1) If Alice and Bob both share the pattern then there is an O(ε^{-4} log^2 n) bit randomised one-way communication protocol. (2) If only Alice has the pattern then there is an O(ε^{-2} √n log n) bit randomised one-way communication protocol. We then go on to develop small-space streaming algorithms for (1+ε)-approximate Hamming distance which give worst-case running time guarantees per arriving symbol. (1) For binary input alphabets there is an O(ε^{-3} √n log^2 n) space and O(ε^{-2} log n) time streaming (1+ε)-approximate Hamming distance algorithm. (2) For general input alphabets there is an O(ε^{-5} √n log^4 n) space and O(ε^{-4} log^3 n) time streaming (1+ε)-approximate Hamming distance algorithm.
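
To make the problem statement concrete, the following is a minimal sketch of the naive exact baseline, not the paper's algorithm: it stores the last n symbols of the stream and recomputes the Hamming distance against the pattern for every arriving symbol, so it uses O(n) space and O(n) time per symbol, which is what the small-space algorithms above aim to improve on. The names `sliding_hamming` and `within_factor` are hypothetical, and the acceptance test in `within_factor` assumes one common reading of a (1+ε)-approximation (an estimate between the exact distance and (1+ε) times it); the paper's precise definition may differ.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def sliding_hamming(pattern: Sequence, stream: Iterable) -> Iterator[int]:
    """Yield the exact Hamming distance between `pattern` and each
    length-n window of `stream`, once n symbols have arrived."""
    n = len(pattern)
    window = deque(maxlen=n)  # keeps the n most recent symbols: O(n) space
    for symbol in stream:
        window.append(symbol)
        if len(window) == n:
            # O(n) work per arriving symbol: count mismatching positions
            yield sum(1 for p, s in zip(pattern, window) if p != s)


def within_factor(estimate: float, exact: int, eps: float) -> bool:
    """Assumed convention: exact <= estimate <= (1 + eps) * exact."""
    return exact <= estimate <= (1 + eps) * exact


if __name__ == "__main__":
    # One distance per window of the stream that aligns with the pattern.
    print(list(sliding_hamming("abab", "abababba")))
```

A streaming approximation algorithm would replace the stored window with a much smaller summary and output, for each arriving symbol, an estimate that `within_factor` would accept with high probability.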
