Iterative Exploration-Driven Sparse SDP Clustering via Thompson Sampling

Abstract

This paper studies high-dimensional sparse clustering, a combinatorial NP-hard problem arising from the bilinear coupling between cluster assignment and feature selection. We analyze semidefinite programming (SDP) relaxations of K-means and establish minimax separation bounds, demonstrating that these relaxations are theoretically robust to feature over-selection: exact recovery is preserved even in the presence of non-informative features. Leveraging this robustness, we propose a block-coordinate ascent framework that alternates between SDP-based clustering and non-conservative feature selection. To address the tendency of deterministic greedy methods to become trapped in local optima, we formulate the feature selection step as a Thompson sampling bandit problem. This approach introduces adaptive memory by aggregating historical variable-selection outcomes into posterior distributions, and selects features via posterior sampling, enabling stochastic exploration that promotes the inclusion of under-explored features and facilitates escape from local maxima. We establish conditions for consistent variable selection and exact clustering recovery, and extend the method to settings with unknown covariance through a scalable, inverse-free estimation procedure. Numerical experiments demonstrate that the proposed memory-driven approach consistently outperforms state-of-the-art sparse clustering methods.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…