The landscape of compressibility measures for two-dimensional data
Abstract
In this paper we extend to two-dimensional data two recently introduced one-dimensional compressibility measures: the γ measure, defined in terms of the smallest string attractor, and the δ measure, defined in terms of the number of distinct substrings of the input string. Concretely, we introduce the two-dimensional measures γ2D and δ2D as natural generalizations of γ and δ, and we initiate the study of their properties. Among other things, we prove that δ2D is monotone and can be computed in linear time, and we show that, although it is still true that δ2D ≤ γ2D, the gap between the two measures can be Ω(√n) and therefore asymptotically larger than the gap between γ and δ. To complete the picture of two-dimensional compressibility measures, we introduce the measure b2D, which generalizes the notion of optimal parsing to two dimensions. We prove that, somewhat surprisingly, the relationship between b2D and γ2D is significantly different from the one-dimensional case. As an application of our results, we provide the first analysis of the space usage of the two-dimensional block tree introduced in [Brisaboa et al., Two-dimensional block trees, The Computer Journal, 2024]. Our analysis shows that the space usage can be bounded in terms of both γ2D and δ2D. Finally, using insights from our analysis, we design the first linear time and space algorithm for constructing the two-dimensional block tree for arbitrary matrices.
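To make the δ measure concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes the standard one-dimensional definition, δ(S) = max over k of d_k(S)/k, where d_k(S) is the number of distinct length-k substrings of S. The two-dimensional function is an assumption: it counts distinct square k×k submatrices and divides by k², which is one natural generalization; the paper's exact definition of δ2D may differ. The function names `delta` and `delta_2d` are hypothetical.

```python
from fractions import Fraction


def delta(s: str) -> Fraction:
    """1D substring complexity: max over k of (distinct length-k substrings) / k."""
    best = Fraction(0)
    for k in range(1, len(s) + 1):
        distinct = {s[i:i + k] for i in range(len(s) - k + 1)}
        best = max(best, Fraction(len(distinct), k))
    return best


def delta_2d(matrix: list[str]) -> Fraction:
    """Assumed 2D analogue: max over k of (distinct k-by-k submatrices) / k^2.

    The matrix is given as a list of equal-length strings, one per row.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    best = Fraction(0)
    for k in range(1, min(rows, cols) + 1):
        # Each k-by-k submatrix is represented as a tuple of k row slices.
        distinct = {
            tuple(matrix[r + i][c:c + k] for i in range(k))
            for r in range(rows - k + 1)
            for c in range(cols - k + 1)
        }
        best = max(best, Fraction(len(distinct), k * k))
    return best
```

This brute-force version enumerates every substring or submatrix explicitly, so it is far from the linear-time computation the abstract mentions; it is only meant to show what is being counted.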