Stability of Density-Based Clustering

Abstract

High-density clusters can be characterized by the connected components of a level set $L(\lambda) = \{x : p(x) > \lambda\}$ of the underlying probability density function $p$ generating the data, at some appropriate level $\lambda \geq 0$. The complete hierarchical clustering can be characterized by a cluster tree $\mathcal{T} = \bigcup_{\lambda} L(\lambda)$. In this paper, we study the behavior of a density level set estimate $\widehat{L}(\lambda)$ and cluster tree estimate $\widehat{\mathcal{T}}$ based on a kernel density estimator with kernel bandwidth $h$. We define two notions of instability to measure the variability of $\widehat{L}(\lambda)$ and $\widehat{\mathcal{T}}$ as a function of $h$, and investigate the theoretical properties of these instability measures.
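
To make the objects in the abstract concrete, the following is a minimal one-dimensional sketch of a level set estimate built from a Gaussian kernel density estimator, with its connected components counted for several bandwidths h. It is an illustration only: the toy data, the grid, the level 0.05, the bandwidth values, and the helper names `gaussian_kde` and `level_set_components` are all assumptions made here, and the component count is not one of the paper's two instability measures, which the abstract does not define.

```python
import math

def gaussian_kde(data, h):
    """Return x -> kernel density estimate (Gaussian kernel, bandwidth h)."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def p_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return p_hat

def level_set_components(p_hat, grid, lam):
    """Approximate the connected components of {x : p_hat(x) > lam}
    on a sorted 1-D grid, returned as (left, right) intervals."""
    components = []
    start = prev = None
    for x in grid:
        if p_hat(x) > lam:
            if start is None:
                start = x          # a new component begins here
            prev = x
        elif start is not None:
            components.append((start, prev))  # component ended at previous point
            start = None
    if start is not None:
        components.append((start, prev))
    return components

# Hypothetical toy data: two evenly spaced clumps centred at -2 and +2.
# Deterministic on purpose, so the result does not depend on a random seed.
offsets = [0.01 * i for i in range(-50, 51)]
data = [-2.0 + d for d in offsets] + [2.0 + d for d in offsets]

grid = [-6.0 + 0.05 * i for i in range(241)]  # evaluation grid on [-6, 6]
lam = 0.05                                    # assumed density level

counts = {}
for h in (0.2, 0.5, 2.0):                     # assumed bandwidths
    comps = level_set_components(gaussian_kde(data, h), grid, lam)
    counts[h] = len(comps)
    print(f"h={h}: {len(comps)} component(s)")
```

On this toy data the two smaller bandwidths keep the clumps as separate components, while the largest bandwidth oversmooths the estimate and merges them into one. That dependence of the estimated clusters on h is the kind of variability the abstract refers to.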
