Coresets for Constrained Clustering: General Assignment Constraints and Improved Size Bounds
Abstract
Designing small-sized coresets, which approximately preserve the costs of the solutions for large datasets, has been an important research direction for the past decade. We consider coreset construction for a variety of general constrained clustering problems. We introduce a general class of assignment constraints, including capacity constraints on cluster centers, and assignment structure constraints for data points (modeled by a convex body B). We give coresets for clustering problems with such general assignment constraints that significantly generalize and improve known results. Notable implications include the first ε-coreset for capacitated and fair k-Median with m outliers in Euclidean spaces whose size is O(m + k^2 ε^{-4}), generalizing and improving upon the prior bounds in [Braverman et al., FOCS' 22; Huang et al., ICLR' 23] (for capacitated k-Median, the coreset size bound obtained in [Braverman et al., FOCS' 22] is O(k^3 ε^{-6}), and for k-Median with m outliers, the coreset size bound obtained in [Huang et al., ICLR' 23] is O(m + k^3 ε^{-5})), and the first ε-coreset of size poly(k ε^{-1}) for fault-tolerant clustering for various types of metric spaces.