Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares

Abstract

A k-mismatch square is a string of the form XY where X and Y are two equal-length strings that have at most k mismatches. Kolpakov and Kucherov [Theor. Comput. Sci., 2003] defined two notions of k-mismatch repeats, called k-repetitions and k-runs, each representing a sequence of consecutive k-mismatch squares of equal length. They proposed algorithms for computing k-repetitions and k-runs working in O(nk k + output) time for a string of length n over an integer alphabet, where output is the number of the reported repeats. We show that output=O(nk k), both in case of k-repetitions and k-runs, which implies that the complexity of their algorithms is actually O(nk k). We apply this result to computing parameterized squares. A parameterized square is a string of the form XY such that X and Y parameterized-match, i.e., there exists a bijection f on the alphabet such that f(X) = Y. Two parameterized squares XY and X'Y' are equivalent if they parameterized match. Recently Hamai et al. [SPIRE 2024] showed that a string of length n over an alphabet of size σ contains less than nσ non-equivalent parameterized squares, improving an earlier bound by Kociumaka et al. [Theor. Comput. Sci., 2016]. We apply our bound for k-mismatch repeats to propose an algorithm that reports all non-equivalent parameterized squares in O(nσ σ) time. We also show that the number of non-equivalent parameterized squares can be computed in O(n n) time. This last algorithm applies to squares under any substring compatible equivalence relation and also to counting squares that are distinct as strings. In particular, this improves upon the O(nσ)-time algorithm of Gawrychowski et al. [CPM 2023] for counting order-preserving squares that are distinct as strings if σ = ω( n).

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…