The Streaming k-Mismatch Problem: Tradeoffs between Space and Total Time
Abstract
We revisit the $k$-mismatch problem in the streaming model on a pattern of length $m$ and a streaming text of length $n$, both over a size-$\sigma$ alphabet. The current state-of-the-art algorithm for the streaming $k$-mismatch problem, by Clifford et al. [SODA 2019], uses $\tilde{O}(k)$ space and $\tilde{O}(\sqrt{k})$ worst-case time per character. The space complexity is known to be (unconditionally) optimal, and the worst-case time per character matches a conditional lower bound. However, there is a gap between the total time cost of the algorithm, which is $\tilde{O}(n\sqrt{k})$, and the fastest known offline algorithm, which costs $\tilde{O}(n + \min(\frac{nk}{\sqrt{m}}, \sigma n))$ time. Moreover, it is not known whether improvements over the $\tilde{O}(n\sqrt{k})$ total time are possible when using more than $O(k)$ space. We address these gaps by designing a randomized streaming algorithm for the $k$-mismatch problem that, given an integer parameter $k \le s \le m$, uses $\tilde{O}(s)$ space and costs $\tilde{O}(n + \min(\frac{nk^2}{m}, \frac{nk}{\sqrt{s}}, \frac{\sigma n m}{s}))$ total time. For $s = m$, the total runtime becomes $\tilde{O}(n + \min(\frac{nk}{\sqrt{m}}, \sigma n))$, which matches the time cost of the fastest offline algorithm. Moreover, the worst-case time cost per character is still $\tilde{O}(\sqrt{k})$.
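The abstract does not restate the problem itself. Under the standard definition, the $k$-mismatch problem asks, for every alignment of the pattern against the text, for the Hamming distance between the pattern and the aligned text window whenever that distance is at most $k$. The following is a minimal sketch of a naive baseline for this definition, not the algorithm of the paper or of Clifford et al.: it keeps the whole text in memory and takes $O(nm)$ time in the worst case, so it is neither streaming nor efficient. The function name `k_mismatch_naive` and the output format (a list of `(end_position, distance_or_None)` pairs) are assumptions made for illustration.

```python
def k_mismatch_naive(pattern, text, k):
    """Naive reference for the k-mismatch problem (illustrative only).

    For each text position `end` at which a full pattern-length window
    ends, report the Hamming distance between the pattern and that
    window if it is at most k, and None otherwise.
    """
    m = len(pattern)
    results = []
    for end in range(m - 1, len(text)):
        start = end - m + 1
        mismatches = 0
        for j in range(m):
            if text[start + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    break  # distance already exceeds k; stop early
        results.append((end, mismatches if mismatches <= k else None))
    return results


if __name__ == "__main__":
    for end, dist in k_mismatch_naive("abc", "abcabd", 1):
        print(end, dist)
```

A streaming algorithm must produce the same answers while reading the text one character at a time and storing far less than the full text; the bounds in the abstract quantify how much space and total time that costs.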