The ∞-Cophenetic Metric for Phylogenetic Trees as an Interleaving Distance
Abstract
Comparing phylogenetic trees is a fundamental task in computational biology, and many metrics are available for it. In this paper, we focus on one such metric, the ∞-cophenetic metric introduced by Cardona et al. This metric represents a phylogenetic tree with n labeled leaves as a point in ℝ^{n(n+1)/2}, known as the cophenetic vector, and then compares the two resulting Euclidean points using the ℓ∞ distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., originally introduced to compare persistence modules arising in topological data analysis. We show that the ∞-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with some additional structure: labelings on the leaves, plus a requirement that morphisms respect these labels. We then use the definition of a flow on this category to give an interleaving distance. Finally, we show that, because of the additional structure given by the categories defined, the map sending a labeled merge tree to its cophenetic vector is an isometric embedding, which proves that the ∞-cophenetic metric is an interleaving distance.
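To make the cophenetic vector concrete, here is a minimal Python sketch, not taken from the paper. It assumes a rooted tree stored as a hypothetical `parent` map with a `length` for each non-root edge, and it takes the entry for a pair of leaves to be the depth of their lowest common ancestor (the leaf's own depth when the pair is a leaf with itself), which gives n(n+1)/2 entries. The paper works with labeled merge trees, where these values would come from the tree's function rather than from summed edge lengths, so treat the encoding as an illustrative assumption.

```python
from itertools import combinations_with_replacement


def depths(parent, length):
    """Depth of each node: sum of edge lengths on the path from the root."""
    memo = {}

    def depth(v):
        if v not in memo:
            p = parent[v]
            memo[v] = 0.0 if p is None else depth(p) + length[v]
        return memo[v]

    return {v: depth(v) for v in parent}


def ancestors(parent, v):
    """Nodes on the path from v up to the root, starting with v itself."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def cophenetic_vector(parent, length, leaves):
    """Depth of the lowest common ancestor for every leaf pair (a, b), a <= b."""
    d = depths(parent, length)
    vec = []
    for a, b in combinations_with_replacement(sorted(leaves), 2):
        anc_a = set(ancestors(parent, a))
        lca = next(v for v in ancestors(parent, b) if v in anc_a)
        vec.append(d[lca])
    return vec


def linf_distance(u, v):
    """Largest coordinate-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(u, v))


# Two hypothetical trees on the same leaf labels a, b, c.
# Tree 1 groups a with b; tree 2 groups b with c.
parent1 = {"r": None, "x": "r", "a": "x", "b": "x", "c": "r"}
length1 = {"x": 1.0, "a": 1.0, "b": 1.0, "c": 2.0}
parent2 = {"r": None, "y": "r", "a": "r", "b": "y", "c": "y"}
length2 = {"y": 1.5, "a": 2.0, "b": 0.5, "c": 0.5}

leaves = ["a", "b", "c"]
v1 = cophenetic_vector(parent1, length1, leaves)
v2 = cophenetic_vector(parent2, length2, leaves)
print(v1)                     # [2.0, 1.0, 0.0, 2.0, 0.0, 2.0]
print(v2)                     # [2.0, 0.0, 0.0, 2.0, 1.5, 2.0]
print(linf_distance(v1, v2))  # 1.5
```

With three leaves each vector has 3(3+1)/2 = 6 entries, ordered (a,a), (a,b), (a,c), (b,b), (b,c), (c,c). The two trees differ most in the (b,c) entry, so the distance in this toy example is 1.5.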