How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning

Abstract

Deep clustering (DC) is often quoted to have a key advantage over k-means clustering. Yet, this advantage is often demonstrated using image datasets only, and it is unclear whether it addresses the fundamental limitations of k-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and performs clustering based on a k-means-like procedure, while the optimization is conducted in an end-to-end manner. This paper investigates whether the deep-learned representation has enabled DEC to overcome the known fundamental limitations of k-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes and densities. Our investigations on DEC have a wider implication on deep clustering methods in general. Notably, none of these methods exploit the underlying data distribution. We uncover that a non-deep learning approach achieves the intended aim of deep clustering by making use of distributional information of clusters in a dataset to effectively address these fundamental limitations.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…