Coresets for k-Means and k-Median Clustering and their Applications

Abstract

I\!-0.025em R S In this paper, we show the existence of small coresets for the problems of computing k-median and k-means clustering for points in low dimension. In other words, we show that given a point set P in d, one can compute a weighted set ⊂eq P, of size O(k -d n), such that one can compute the k-median/means clustering on instead of on P, and get an (1+)-approximation. As a result, we improve the fastest known algorithms for (1+)-approximate k-means and k-median clustering. Our algorithms have linear running time for a fixed k and . In addition, we can maintain the (1+)-approximate k-median or k-means clustering of a stream when points are being only inserted, using polylogarithmic space and update time.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…