The price of incrementality in k-center clustering
Abstract
The k-center problem is one of the best-studied and most intuitive clustering formulations. It asks, given a set of n points in a metric space, for k of the points to be designated as cluster centers, so that the maximum distance of an input point to its nearest center is minimized. Gonzalez's greedy algorithm from 1985 is a simple and efficient way to find a 2-approximate solution. The algorithm has the attractive feature of incrementality: it outputs the centers one by one, with a guaranteed 2-approximation for every prefix of the obtained sequence of centers. Incrementality imposes a geometric constraint on how solutions can be built, and it is natural to ask whether this comes at a price in the quality of the solution. It is known that in polynomial time, the approximation ratio of 2 is best possible, assuming P ≠ NP. In this paper we show that even with unlimited computational power, the factor 2 cannot be improved, if the solution is required to be built incrementally. The lower bound construction imposes a tradeoff between all n levels of the clustering simultaneously; it was obtained with the help of ChatGPT, an aspect we discuss in Section 3 of the paper.
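The greedy procedure mentioned in the abstract is commonly described as a farthest-first traversal: start from an arbitrary point, then repeatedly add the point that is farthest from the centers chosen so far. The following is a minimal sketch of that idea, not the paper's own code. The function names `farthest_first` and `radius`, the `start` parameter, and the use of Euclidean distance via `math.dist` are illustrative assumptions; any metric could be passed in as `dist`.

```python
import math


def farthest_first(points, dist=math.dist, start=0):
    """Return the indices of all points in farthest-first order.

    Illustrative sketch: the first k indices of the result serve as the
    k centers, so a single ordering covers every value of k.
    """
    n = len(points)
    if n == 0:
        return []
    order = [start]
    chosen = [False] * n
    chosen[start] = True
    # d[i] is the distance from point i to its nearest chosen center.
    d = [dist(points[i], points[start]) for i in range(n)]
    for _ in range(n - 1):
        # Pick the not-yet-chosen point farthest from the current centers
        # (ties go to the smallest index).
        nxt = max((i for i in range(n) if not chosen[i]), key=lambda i: d[i])
        order.append(nxt)
        chosen[nxt] = True
        # Adding a center can only shrink each point's nearest-center distance.
        for i in range(n):
            d[i] = min(d[i], dist(points[i], points[nxt]))
    return order


def radius(points, centers, dist=math.dist):
    """k-center objective: largest distance from any point to its nearest center."""
    return max(min(dist(p, points[c]) for c in centers) for p in points)
```

For example, with `points = [(0, 0), (10, 0), (1, 0), (9, 0), (5, 0)]` and `order = farthest_first(points)`, the call `radius(points, order[:k])` evaluates the objective for the first `k` centers. Because the centers for every `k` are a prefix of one ordering, this is an incremental solution in the sense used in the abstract.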