Kernel K-means clustering of distributional data

Abstract

We consider the problem of clustering a sample of probability distributions from a random distribution on R^p. Our proposed partitioning method makes use of a symmetric, positive-definite kernel k and its associated reproducing kernel Hilbert space (RKHS) H. By mapping each distribution to its corresponding kernel mean embedding in H, we obtain a sample in this RKHS where we carry out the K-means clustering procedure, which provides an unsupervised classification of the original sample. The procedure is simple and computationally feasible even for dimension p > 1. The simulation studies provide insight into the choice of the kernel and its tuning parameter. The performance of the proposed clustering procedure is illustrated on a collection of Synthetic Aperture Radar (SAR) images.
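
As a rough illustration of the procedure described above, the following Python sketch clusters distributions through their kernel mean embeddings. It is not the authors' implementation. It assumes that each distribution is observed only through a finite sample of points in R^p, that k is a Gaussian kernel with a hypothetical bandwidth sigma, and that K-means in H is run with a plain Lloyd-type iteration from a random initial labelling. The function names gaussian_kernel, embedding_gram and kernel_kmeans are invented for this example. Inner products between embeddings are estimated by averaging k over all pairs of observations, so the centroids in H never need to be formed explicitly.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma):
    """Pairwise Gaussian kernel values between rows of X (m, p) and Y (n, p)."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def embedding_gram(samples, sigma):
    """Gram matrix of empirical kernel mean embeddings.

    G[i, j] estimates <mu_i, mu_j>_H as the average of k(x, y) over all
    observations x from sample i and y from sample j.
    """
    n = len(samples)
    G = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            G[i, j] = G[j, i] = gaussian_kernel(samples[i], samples[j], sigma).mean()
    return G


def kernel_kmeans(G, n_clusters, n_iter=100, seed=0):
    """Lloyd-type K-means in the RKHS, using only the Gram matrix G."""
    rng = np.random.default_rng(seed)
    n = G.shape[0]
    labels = rng.integers(n_clusters, size=n)  # random initial partition (assumption)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            if not members.any():
                continue  # an empty cluster keeps infinite distance
            # ||mu_i - centroid_c||^2 expanded in terms of Gram entries.
            dist[:, c] = (np.diag(G)
                          - 2.0 * G[:, members].mean(axis=1)
                          + G[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels
```

With a list samples of arrays of shape (m_i, p), one call to embedding_gram(samples, sigma) followed by kernel_kmeans(G, n_clusters) returns one cluster label per distribution. The bandwidth sigma plays the role of the kernel's tuning parameter, and like ordinary K-means the result can depend on the initial labelling, so several seeds may be worth comparing.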
