Importance Sampling Approximation of Sequence Evolution Models with Site-Dependence
Abstract
We consider models for molecular sequence evolution in which the transition rates at each site depend on the local sequence context, giving rise to a time-inhomogeneous Markov process in which sites evolve under a complex dependency structure. We introduce a randomized approximation algorithm for the marginal sequence likelihood under these models using importance sampling, and provide matching-order upper and lower bounds on the finite-sample approximation error. Given two sequences of length n with r observed mutations, we show that for practical regimes of r/n, the complexity of the importance sampler does not grow exponentially in n, but rather in r, making the algorithm practical for many applied problems. We demonstrate the use of our techniques to obtain problem-specific complexity bounds for a well-known dependent-site model from the phylogenetics literature.
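As a rough illustration of the general idea of approximating a marginal likelihood by importance sampling, the following minimal sketch estimates a sum over a latent variable by averaging weighted draws from a proposal distribution. It is a generic toy example, not the algorithm of the paper: the three-state latent model, the values in `prior`, `likelihood`, and `proposal`, and all function names are assumptions made purely for illustration.

```python
import random


def importance_sampling_estimate(f, sample_q, prob_q, num_samples, rng):
    """Estimate sum_z f(z) as the average of f(z) / q(z), with z drawn from q."""
    total = 0.0
    for _ in range(num_samples):
        z = sample_q(rng)
        total += f(z) / prob_q(z)  # importance weight for this draw
    return total / num_samples


# Hypothetical toy model: latent state z in {0, 1, 2} and one fixed observation y.
prior = [0.5, 0.3, 0.2]          # p(z), assumed values
likelihood = [0.01, 0.2, 0.6]    # p(y | z), assumed values


def joint(z):
    """Unnormalized target p(z) * p(y | z); summing over z gives p(y)."""
    return prior[z] * likelihood[z]


# Exact marginal likelihood, available here only because the toy model is tiny.
exact = sum(joint(z) for z in range(3))

# Hypothetical proposal that puts more mass where the joint is large.
proposal = [0.1, 0.3, 0.6]


def sample_q(rng):
    return rng.choices(range(3), weights=proposal)[0]


def prob_q(z):
    return proposal[z]


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = importance_sampling_estimate(joint, sample_q, prob_q, 20000, rng)
print(f"exact = {exact:.4f}, estimate = {estimate:.4f}")
```

The estimator is unbiased whenever the proposal assigns positive probability to every state where the target is nonzero, and its variance depends on how closely the proposal tracks the target. That dependence is why finite-sample error bounds of the kind described in the abstract matter in practice.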