Simple Linear-time Repetition Factorization

Abstract

A factorization $f_1, \ldots, f_m$ of a string $w$ of length $n$ is called a repetition factorization of $w$ if each $f_i$ is a repetition, i.e., $f_i$ is of the form $x^k x'$, where $x$ is a non-empty string, $x'$ is a (possibly empty) proper prefix of $x$, and $k \geq 2$. Dumitran et al. [SPIRE 2015] presented an $O(n)$-time and $O(n)$-space algorithm for computing an arbitrary repetition factorization of a given string of length $n$. Their algorithm relies heavily on the Union-Find data structure on trees proposed by Gabow and Tarjan [JCSS 1985], which works in linear time on the word RAM model, and on an interval stabbing data structure of Schmidt [ISAAC 2009]. In this paper, we explore further combinatorial insights into the problem and present a simple algorithm that computes an arbitrary repetition factorization of a given string of length $n$ in $O(n)$ time, without relying on data structures for Union-Find or interval stabbing. Our algorithm follows the approach of Inoue et al. [ToCS 2022], which computes the smallest/largest repetition factorization in $O(n \log n)$ time.
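To make the definition concrete, the following is a minimal Python sketch of a naive quadratic-time baseline. It is not the linear-time algorithm of this paper, nor the algorithms of Dumitran et al. or Inoue et al.; it only illustrates what a repetition factorization is. It uses the fact that a string is of the form $x^k x'$ with $k \geq 2$ exactly when its smallest period is at most half its length. The function names `prefix_function`, `is_repetition`, and `repetition_factorization` are hypothetical and chosen for illustration.

```python
from typing import List, Optional


def prefix_function(s: str) -> List[int]:
    """Classic KMP failure function: pi[i] is the length of the longest
    proper border (prefix that is also a suffix) of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def is_repetition(s: str) -> bool:
    """True iff s = x^k x' with k >= 2 and x' a proper prefix of x,
    i.e. iff the smallest period of s is at most len(s) / 2."""
    if not s:
        return False
    period = len(s) - prefix_function(s)[-1]
    return 2 * period <= len(s)


def repetition_factorization(w: str) -> Optional[List[str]]:
    """Naive O(n^2) dynamic program (illustrative baseline only).
    Returns some repetition factorization of w, or None if none exists."""
    n = len(w)
    reachable = [False] * (n + 1)  # reachable[i]: w[:i] can be factorized
    prev = [-1] * (n + 1)          # start position of the last factor of w[:i]
    reachable[0] = True
    for j in range(n):
        if not reachable[j]:
            continue
        # One prefix function per start position gives the smallest period
        # of every prefix of w[j:] in O(n) total time.
        pi = prefix_function(w[j:])
        for length in range(2, n - j + 1):
            period = length - pi[length - 1]
            if 2 * period <= length and not reachable[j + length]:
                reachable[j + length] = True
                prev[j + length] = j
    if not reachable[n]:
        return None
    # Walk the back-pointers from the end of w to recover the factors.
    factors = []
    i = n
    while i > 0:
        factors.append(w[prev[i]:i])
        i = prev[i]
    return factors[::-1]
```

For example, `repetition_factorization("aababab")` yields the two factors `"aa"` (with $x = a$, $k = 2$) and `"babab"` (with $x = ba$, $k = 2$, $x' = b$), whereas `"aabab"` has no repetition factorization and the function returns `None`.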

