A Fast Approximation Scheme for Low-Dimensional k-Means

Abstract

We consider the popular k-means problem in d-dimensional Euclidean space. Recently, Friggstad, Rezapour, and Salavatipour [FOCS'16] and Cohen-Addad, Klein, and Mathieu [FOCS'16] showed that the standard local search algorithm yields a $(1+\varepsilon)$-approximation in time $(n \cdot k)^{1/\varepsilon^{O(d)}}$, giving the first polynomial-time approximation scheme for the problem in low-dimensional Euclidean space. While local search achieves optimal approximation guarantees, its running time is not competitive with state-of-the-art heuristics such as the famous k-means++ and $D^2$-sampling algorithms. In this paper, we aim to bridge the gap between theory and practice by giving a $(1+\varepsilon)$-approximation algorithm for low-dimensional k-means running in time $n \cdot k \cdot (\log n)^{(d\varepsilon^{-1})^{O(d)}}$, thus matching the running time of the k-means++ and $D^2$-sampling heuristics up to polylogarithmic factors. We speed up the local search approach by making a non-standard use of randomized dissections, which allows us to find the best local move efficiently using a fairly simple dynamic program. We hope that our techniques can help design better local search heuristics for geometric problems. We note that the doubly exponential dependency on d is necessary, as k-means is APX-hard in dimension $d = \omega(\log n)$.
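
For readers unfamiliar with the two baselines named above, the following is a minimal Python sketch of $D^2$-sampling (the seeding rule behind k-means++) and of a naive single-swap local search step. It is not the algorithm of this paper: the dissection-based dynamic program is not shown, and the function names (`d2_sampling`, `best_single_swap`, `kmeans_cost`), the restriction of candidate centers to input points, and the single-center swap are all assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def d2_sampling(points, k, rng=None):
    """Pick k centers; each new center is drawn with probability
    proportional to its squared distance to the nearest chosen center."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center: sample uniformly.
            nxt = rng.choice(points)
        else:
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, sq_dist(p, nxt)) for old, p in zip(d2, points)]
    return centers


def best_single_swap(points, centers):
    """Naive local move (illustrative assumption): try replacing each
    center by each input point and keep the cheapest resulting solution."""
    best_cost, best_centers = kmeans_cost(points, centers), list(centers)
    for i in range(len(centers)):
        for p in points:
            trial = centers[:i] + [p] + centers[i + 1:]
            cost = kmeans_cost(points, trial)
            if cost < best_cost:
                best_cost, best_centers = cost, trial
    return best_centers, best_cost


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    seeds = d2_sampling(pts, 2, random.Random(0))
    improved, cost = best_single_swap(pts, seeds)
    print(seeds, improved, cost)
```

In this naive form, each call to `best_single_swap` evaluates roughly n · k candidate swaps and recomputes the full cost for each one. That exhaustive search for the best move is the step the abstract says is made efficient through randomized dissections and dynamic programming.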
