Confidence sets for phylogenetic trees

Abstract

Inferring evolutionary histories (phylogenetic trees) has important applications in biology, criminology and public health. However, phylogenetic trees are complex mathematical objects that reside in a non-Euclidean space, which complicates their analysis. While our mathematical, algorithmic, and probabilistic understanding of phylogenies in their metric space is mature, rigorous inferential infrastructure is as yet undeveloped. In this manuscript we unify recent computational and probabilistic advances to construct tree-valued confidence sets. The procedure accounts for both centre and multiple directions of tree-valued variability. We draw on block replicates to improve testing, identifying the best supported most recent ancestor of the Zika virus, and formally testing the hypothesis that a Floridian dentist with AIDS infected two of his patients with HIV. The method illustrates connections between variability in Euclidean and tree space, opening phylogenetic tree analysis to techniques available in the multivariate Euclidean setting.
