Hidden Markov Models on Variable Blocks with a Modal Clustering Algorithm and Applications

Abstract

Motivated by high-throughput single-cell cytometry data with applications to vaccine development and immunological research, we consider statistical clustering in large-scale data that contain multiple rare clusters. We propose a new hierarchical mixture model, the Hidden Markov Model on Variable Blocks (HMM-VB), and a new mode search algorithm, Modal Baum-Welch (MBW), for efficient clustering. Exploiting the widely accepted chain-like dependence among groups of variables in cytometry data, we treat the hierarchy of variable groups as a figurative timeline and employ an HMM-type model, namely HMM-VB. We also propose to use mode-based clustering, also known as modal clustering, and overcome the exponential computational complexity with MBW. In a series of experiments on simulated data, HMM-VB and MBW perform better than existing methods. We also apply our method to identify rare cell subsets in cytometry data and examine its strengths and limitations.
