Parameterized Complexity of Categorical Clustering with Size Constraints
Abstract
In the Categorical Clustering problem, we are given a set of vectors (matrix) $A=\{a_1,\dots,a_n\}$ over $\Sigma^m$, where $\Sigma$ is a finite alphabet, and integers $k$ and $B$. The task is to partition $A$ into $k$ clusters such that the median objective of the clustering in the Hamming norm is at most $B$. That is, we seek a partition $\{I_1,\dots,I_k\}$ of $\{1,\dots,n\}$ and vectors $c_1,\dots,c_k\in\Sigma^m$ such that $\sum_{i=1}^{k}\sum_{j\in I_i} d_H(c_i,a_j)\le B$, where $d_H(a,b)$ is the Hamming distance between vectors $a$ and $b$. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable (for the binary case $\Sigma=\{0,1\}$) by giving an algorithm that solves the problem in time $2^{O(B\log B)}(mn)^{O(1)}$. We extend this algorithmic result to a popular capacitated clustering model, where in addition the sizes of the clusters should satisfy certain constraints. More precisely, in Capacitated Clustering, we are additionally given two non-negative integers $p$ and $q$, and seek a clustering with $p\le |I_i|\le q$ for all $i\in\{1,\dots,k\}$. Our main theorem is that Capacitated Clustering is solvable in time $2^{O(B\log B)}|\Sigma|^{B}(mn)^{O(1)}$. The theorem not only extends the previous algorithmic results to a significantly more general model, it also implies algorithms for several other variants of Categorical Clustering with constraints on cluster sizes.
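To make the objective concrete, the following Python sketch evaluates the median cost of a given partition and checks the capacity constraints $p\le |I_i|\le q$. It is only a checker for a candidate solution, not the fixed-parameter algorithm of the paper. The function names (`hamming`, `best_center`, `clustering_cost`, `is_feasible`) are hypothetical, and the sketch relies on the standard fact that, for a fixed cluster, choosing the most frequent symbol in each coordinate minimizes the total Hamming distance to the center.

```python
from collections import Counter


def hamming(a, b):
    """Hamming distance d_H(a, b) between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def best_center(cluster):
    """Column-wise most frequent symbol; minimizes the summed Hamming distance."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def clustering_cost(vectors, partition):
    """Sum over clusters I_i of sum_{j in I_i} d_H(c_i, a_j), with optimal c_i."""
    total = 0
    for indices in partition:
        cluster = [vectors[j] for j in indices]
        center = best_center(cluster)
        total += sum(hamming(center, a) for a in cluster)
    return total


def is_feasible(vectors, partition, B, p, q):
    """Check the size constraints p <= |I_i| <= q and the cost bound B."""
    sizes_ok = all(p <= len(indices) <= q for indices in partition)
    return sizes_ok and clustering_cost(vectors, partition) <= B


# Illustrative instance (assumed data): four binary vectors, k = 2 clusters.
vectors = ["0000", "0001", "1110", "1111"]
partition = [[0, 1], [2, 3]]
cost = clustering_cost(vectors, partition)
```

On this toy instance each cluster differs from its center in a single coordinate, so the partition is feasible for $B=2$, $p=q=2$ and infeasible once $B$ is lowered to 1 or $p$ is raised to 3.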