Resampling Residuals on Phylogenetic Trees: Extended Results
Abstract
This article extends the results of Waddell and Azad (2009). In particular, the geometric percentage mean standard deviation, a measure of the fit of distances to a phylogenetic tree, is adjusted for the number of parameters fitted to the model. The formulae are also presented in their general form for any weight that is a function of the distance. The cell line gene expression data set of Ross et al. (2000) is reanalyzed. An ordinary least squares (OLS) tree is shown to fit the data much better than a Neighbor Joining or BME tree. Residual resampling shows that cancer cell lines do indeed fit a tree fairly well and that the tree has strong internal structure. Simulations show that least squares tree-building methods, including OLS, are strong competitors to BME-type methods for fitting model data, and real-world examples often suggest the same conclusion.