On k-means for segments and polylines
Abstract
We study the problem of k-means clustering in the space of straight-line segments in $\mathbb{R}^2$ under the Hausdorff distance. For this problem, we give a $(1+\varepsilon)$-approximation algorithm that, for an input of $n$ segments, for any fixed $k$, and with constant success probability, runs in time $O(n + \varepsilon^{-O(k)} + \varepsilon^{-O(k)} \cdot \log^{O(k)}(\varepsilon^{-1}))$. The algorithm has two main ingredients. First, we express the k-means objective in our metric space as a sum of algebraic functions and use the optimization technique of Vigneron [Vigneron14] to approximate its minimum. Second, we reduce the input size by computing a small coreset using the sensitivity-based sampling framework of Feldman and Langberg [Feldman11, Feldman2020]. Our results can be extended to polylines of constant complexity with a running time of $O(n + \varepsilon^{-O(k)})$.
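To make the objective concrete, the following is a minimal sketch of how the k-means cost of a candidate set of center segments could be evaluated under the Hausdorff distance. It is not the approximation algorithm of the paper; it only evaluates the quantity being minimized, assuming the usual k-means form (sum of squared distances to the nearest center). It relies on the fact that the distance to a convex set is a convex function, so the directed Hausdorff distance from one segment to another is attained at an endpoint. The function names (`point_segment_distance`, `hausdorff_segments`, `k_means_cost`) and the representation of a segment as a pair of points are illustrative assumptions.

```python
import math

# A segment is a pair of endpoints ((x1, y1), (x2, y2)); this representation
# is an assumption made for the sketch.

def point_segment_distance(p, seg):
    """Euclidean distance from point p to the closed segment seg."""
    (ax, ay), (bx, by) = seg
    px, py = p
    dx, dy = bx - ax, by - ay
    len2 = dx * dx + dy * dy
    if len2 == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def hausdorff_segments(s, t):
    """Hausdorff distance between two segments in the plane.

    The distance to a segment is convex, so its maximum over the other
    segment is reached at one of that segment's endpoints.
    """
    s_to_t = max(point_segment_distance(p, t) for p in s)
    t_to_s = max(point_segment_distance(q, s) for q in t)
    return max(s_to_t, t_to_s)

def k_means_cost(segments, centers):
    """Sum over input segments of the squared distance to the nearest center."""
    return sum(
        min(hausdorff_segments(s, c) for c in centers) ** 2
        for s in segments
    )

if __name__ == "__main__":
    segments = [((0, 0), (1, 0)), ((0, 1), (1, 1)), ((10, 0), (11, 0))]
    centers = [((0, 0), (1, 0)), ((10, 0), (11, 0))]
    print(k_means_cost(segments, centers))
```

Evaluating the cost this way takes O(nk) time per candidate solution, which is why reducing the n input segments to a small coreset matters before any search over centers.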