An introduction to infinite HMMs for single molecule data analysis
Abstract
The hidden Markov model (HMM) has been a workhorse of single molecule data analysis and is now commonly used as a standalone tool in time series analysis or in conjunction with other analysis methods such as tracking. Here we provide a conceptual introduction to an important generalization of the HMM which is poised to have a deep impact across Biophysics: the infinite hidden Markov model (iHMM). As a modeling tool, iHMMs can analyze sequential data without setting a specific number of states a priori, as is required for the traditional (finite) HMM. While the current literature on the iHMM is primarily intended for audiences in Statistics, the idea is powerful and the iHMM's breadth of applicability outside Machine Learning and Data Science warrants a careful exposition. Here we explain the key ideas underlying the iHMM with a special emphasis on implementation and provide a description of a code we are making freely available. In a companion article, we provide an important extension of the iHMM to accommodate complications such as drift.