Computing Exact Clustering Posteriors with Subset Convolution

Abstract

An exponential-time exact algorithm is provided for the task of clustering n items of data into k clusters. Instead of seeking one partition, posterior probabilities are computed for summary statistics: the number of clusters, and pairwise co-occurrence. The method is based on subset convolution, and yields the posterior distribution for the number of clusters in O(n * 3^n) operations, or O(n^3 * 2^n) using fast subset convolution. Pairwise co-occurrence probabilities are then obtained in O(n^3 * 2^n) operations. This is considerably faster than exhaustive enumeration of all partitions.
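To make the O(n * 3^n) figure concrete, here is a minimal Python sketch of the plain (non-fast) subset-convolution recurrence for the number of clusters. It is an illustration under stated assumptions, not the paper's implementation: the function names `partition_weights` and `posterior_num_clusters` are hypothetical, subsets are encoded as bitmasks, and `cluster_weight(S)` is an assumed user-supplied function returning the unnormalized weight of a single cluster S (for example a marginal likelihood with any prior factor folded in). The abstract does not specify the model, so the weight function is left abstract here.

```python
def partition_weights(n, cluster_weight):
    """Return totals[k] = sum, over all partitions of {0..n-1} into exactly
    k clusters, of the product of the cluster weights (totals[0] is 0).

    cluster_weight(S) is an ASSUMED callable taking a nonempty bitmask S.
    """
    size = 1 << n
    full = size - 1

    # f[S]: weight of S as a single cluster (f[0] = 0 for the empty set).
    f = [0] * size
    for S in range(1, size):
        f[S] = cluster_weight(S)

    g = f[:]                  # g[S]: weight of partitions of S into 1 cluster
    totals = [0, g[full]]

    for k in range(2, n + 1):
        h = [0] * size        # h[S]: weight of partitions of S into k clusters
        for S in range(1, size):
            low = S & -S      # lowest item of S; its cluster is called T
            rest = S ^ low
            sub = rest
            while True:       # enumerate every T = low | sub, sub within rest
                T = low | sub
                R = S ^ T     # items left for the other k - 1 clusters
                if R:
                    h[S] += f[T] * g[R]
                if sub == 0:
                    break
                sub = (sub - 1) & rest
        g = h
        totals.append(g[full])
    return totals


def posterior_num_clusters(n, cluster_weight):
    """Normalize the totals into a distribution over k = 0..n.

    Assumes at least one partition has positive weight.
    """
    totals = partition_weights(n, cluster_weight)
    z = sum(totals)
    return [t / z for t in totals]
```

Forcing the lowest item of S into T means each partition is counted exactly once rather than once per ordering of its clusters. Each of the n - 1 passes over k visits every (subset, sub-subset) pair, which is where the n * 3^n operation count comes from. As a sanity check, with `cluster_weight = lambda S: 1` and n = 4 the totals are the Stirling numbers of the second kind 1, 7, 6, 1, which sum to the Bell number 15.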
