A Bayesian Hidden Semi-Markov Model with Covariate-Dependent State Duration Parameters for High-Frequency Environmental Data
Abstract
Environmental time series data observed at high frequencies can be studied with approaches such as hidden Markov and semi-Markov models (HMM and HSMM). HSMMs extend the HMM by explicitly modeling the time spent in each state. In a discrete-time HSMM, the duration in each state can be modeled with a zero-truncated Poisson distribution, where the duration parameter may be state-specific but constant in time. We extend the HSMM by allowing the state-specific duration parameters to vary in time and model them as a function of known covariates observed over a period of time leading up to a state transition. In addition, we propose a data subsampling approach given that high-frequency data can violate the conditional independence assumption of the HSMM. We apply the model to high-frequency data collected by an instrumented buoy in Lake Mendota. We model the phycocyanin concentration, which is used in aquatic systems to estimate the relative abundance of blue-green algae, and identify important time-varying effects associated with the duration in each state.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.