Least Squares Methods for Equidistant Tree Reconstruction

Abstract

UPGMA is a heuristic method for identifying the least squares equidistant phylogenetic tree given empirical distance data among n taxa. We study this classic algorithm using the geometry of the space of all equidistant trees with n leaves, also known as the Bergman complex of the graphical matroid for the complete graph K_n. We show that UPGMA performs an orthogonal projection of the data onto a maximal cell of the Bergman complex. We also show that the equidistant tree with the least (Euclidean) distance from the data is obtained from such an orthogonal projection, but is not necessarily the one given by UPGMA. Using this geometric information, we give an extension of the UPGMA algorithm. We also present a branch-and-bound method for finding the best equidistant tree. Finally, we prove that there are distance data among n taxa that project to at least (n-1)! equidistant trees.
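For readers unfamiliar with the classic algorithm, here is a minimal sketch of textbook UPGMA, not the extension or the branch-and-bound method of the paper. It assumes the input is a symmetric matrix with zero diagonal given as a list of lists; the function names `upgma` and `squared_error` are illustrative and do not come from the paper. The output is the equidistant tree metric as a matrix, where each cross-cluster entry is set to the average of the original distances between the two merged clusters, which is the averaging behaviour the abstract describes as an orthogonal projection.

```python
from itertools import combinations


def upgma(d):
    """Return the equidistant tree metric (an ultrametric) built by UPGMA.

    d: symmetric n x n list of lists with zero diagonal (assumed input format).
    """
    n = len(d)
    clusters = {i: [i] for i in range(n)}  # cluster id -> leaf indices
    # Average distance between every pair of current clusters.
    dist = {frozenset(p): float(d[p[0]][p[1]]) for p in combinations(range(n), 2)}
    u = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda p: dist[frozenset(p)])
        d_ab = dist[frozenset((a, b))]
        # Every leaf pair split by this merge receives the same distance.
        for i in clusters[a]:
            for j in clusters[b]:
                u[i][j] = u[j][i] = d_ab
        # Size-weighted average keeps dist equal to the mean over leaf pairs.
        size_a, size_b = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c in (a, b):
                continue
            dist[frozenset((next_id, c))] = (
                size_a * dist[frozenset((a, c))] + size_b * dist[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return u


def squared_error(d, u):
    """Squared Euclidean distance between two metrics, summed over leaf pairs."""
    return sum((d[i][j] - u[i][j]) ** 2
               for i, j in combinations(range(len(d)), 2))


if __name__ == "__main__":
    data = [[0, 2, 4],
            [2, 0, 6],
            [4, 6, 0]]
    tree_metric = upgma(data)
    print(tree_metric)
    print(squared_error(data, tree_metric))
```

In the three-taxon example, taxa 0 and 1 merge first at distance 2, and both of their distances to taxon 2 are replaced by the average of 4 and 6. As the abstract notes, the tree found this way need not be the least squares optimum over all equidistant trees.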
