Coresets for (k,l)-Clustering under the Fr\'echet Distance
Abstract
Clustering is the task of partitioning a given set of geometric objects. This task has been studied thoroughly when the objects are points in Euclidean space, and there are also several approaches for points in general metric spaces. In this thesis we consider clustering polygonal curves, i.e., curves composed of line segments, under the Fr\'echet distance. We obtain clusterings by minimizing an objective function, which yields a set of centers that induces a partition of the input. The objective functions we consider are the so-called (k,l)-center, where we seek k center curves that minimize the maximum distance between any input curve and a nearest center curve, and the so-called k-median, where we seek k center curves that minimize the sum of the distances between each input curve and a nearest center curve. Given a set of n polygonal curves, we are interested in reducing this set to an ε-coreset, i.e., a notably smaller set of curves with very similar clustering behavior. For line segments, we develop a construction method for the (k,l)-center that yields ε-coresets of size polynomial in 1/ε, in time linear in n and polynomial in 1/ε. For general polygonal curves, we develop a construction technique for the (k,l)-center that yields ε-coresets of size exponential in m with base 1/ε, in time sub-quadratic in n and exponential in m with base 1/ε. Finally, we develop a construction method for the k-median that yields ε-coresets of size polylogarithmic in n and polynomial in 1/ε, in time linear in n and polynomial in 1/ε.
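To make the two objective functions concrete, the following minimal Python sketch evaluates the center cost (maximum distance to a nearest center) and the median cost (sum of distances to a nearest center) for a fixed set of center curves. It rests on assumptions not taken from the abstract: it uses the discrete Fr\'echet distance over curve vertices as a simple stand-in for the continuous Fr\'echet distance, it does not enforce any complexity bound on the centers, and the names discrete_frechet, center_cost, and median_cost are hypothetical. It illustrates the objectives only, not the coreset constructions.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences.

    Assumption: this is a stand-in for the continuous Frechet
    distance named in the abstract, not the same quantity.
    """
    n, m = len(p), len(q)
    # dp[i][j] holds the discrete Frechet distance of p[:i+1] and q[:j+1].
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]


def center_cost(curves, centers):
    """Maximum over input curves of the distance to a nearest center."""
    return max(min(discrete_frechet(c, z) for z in centers) for c in curves)


def median_cost(curves, centers):
    """Sum over input curves of the distance to a nearest center."""
    return sum(min(discrete_frechet(c, z) for z in centers) for c in curves)


# Hypothetical toy input: four horizontal unit segments and two centers.
curves = [
    [(0, 0), (1, 0)],
    [(0, 1), (1, 1)],
    [(0, 3), (1, 3)],
    [(0, 5), (1, 5)],
]
centers = [[(0, 0), (1, 0)], [(0, 5), (1, 5)]]
```

On this toy input, the curves at heights 0 and 5 coincide with a center, the curve at height 1 is at distance 1 from the lower center, and the curve at height 3 is at distance 2 from the upper center. An ε-coreset would be a smaller set of curves on which these costs stay close to the costs on the full input.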