Nonparametric Modeling of Continuous-Time Markov Chains

Abstract

Inferring the infinitesimal rates of continuous-time Markov chains (CTMCs) is a central challenge in many scientific domains. This task is difficult because the number of rates grows quadratically with the state space, rates can be strongly dependent, and many transitions may be only partially observed. We introduce a Bayesian framework that models CTMC rates as flexible functions of covariates through Gaussian processes. This enables nonlinear covariate effects, improves inference by incorporating external information, and helps identify potential drivers of CTMC dynamics. For posterior inference, we use Hamiltonian Monte Carlo and develop scalable exact and approximate gradients for likelihoods involving repeated matrix exponentials. With N observations and K CTMC states, these gradients reduce the dominant cost of existing derivative calculations from O(NK^3), with large constants, to O(K^3 + NK^2), with cheaper constants. We demonstrate the method in Bayesian phylogenetic and phylogeographic inference, where CTMCs are central, and show strong performance on synthetic and real datasets, including empirical quadratic scaling in K even when N < K.
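
To make the objects in the abstract concrete, the following is a minimal sketch of a CTMC rate matrix whose off-diagonal rates depend on a covariate, and of the transition probabilities P(t) = exp(Qt) that enter the likelihood. It is an illustration under stated assumptions, not the paper's method: `rate_matrix`, `transition_probabilities`, the covariate values, and the quadratic `effect` function (a stand-in for a Gaussian-process draw) are all hypothetical, and the matrix exponential uses a plain Taylor series with scaling and squaring rather than the gradient algorithms the abstract describes.

```python
import math


def rate_matrix(covariate, effect):
    """Build a K x K rate matrix from a pairwise covariate.

    covariate[i][j] is a hypothetical predictor for the i -> j transition;
    effect maps it to a log-rate (here a stand-in for a GP draw).
    """
    k = len(covariate)
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                q[i][j] = math.exp(effect(covariate[i][j]))
        # The diagonal is still zero here, so this makes the row sum to zero.
        q[i][i] = -sum(q[i])
    return q


def matmul(a, b):
    """Dense product of two square matrices: O(K^3) operations."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) by Taylor series with scaling and squaring."""
    k = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(k)] for i in range(k)]
    result = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    term = [row[:] for row in result]
    for n in range(1, terms + 1):
        # term holds a^n / n! after this step.
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(k)]
                  for i in range(k)]
    for _ in range(squarings):
        # Undo the scaling: exp(a)^(2^squarings) = exp(Q t).
        result = matmul(result, result)
    return result


# Hypothetical symmetric covariate (for example, a distance between states).
covariate = [[0.0, 1.0, 2.0],
             [1.0, 0.0, 1.0],
             [2.0, 1.0, 0.0]]
effect = lambda x: -0.5 * x * x  # assumed nonlinear effect on the log-rate
Q = rate_matrix(covariate, effect)
P = transition_probabilities(Q, t=0.7)
```

Each call to `transition_probabilities` costs a fixed number of O(K^3) matrix products, so evaluating it separately for N observations with distinct times scales as O(NK^3) in this naive sketch. That is the kind of repeated matrix-exponential cost the abstract refers to.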
