TreeShrink: Efficient detection of outlier tree leaves

Uyen Mai; Siavash Mirarab

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10562 LNBI, pp. 116–140 (2017)

DOI: 10.1007/978-3-319-67979-2_7

Abstract

Phylogenetic trees include errors for a variety of reasons. We argue that one way to detect errors is to build a phylogeny with all the data and then detect taxa that artificially inflate the tree diameter. We formulate an optimization problem that seeks k leaves whose removal maximally reduces the tree diameter, and we present a polynomial-time solution to this “k-shrink” problem. Given this solution, we then use non-parametric statistics to find an outlier set of taxa that have an unexpectedly high impact on the tree diameter. We test our method, TreeShrink, on five biological datasets and show that it is more conservative than rogue taxon removal using RogueNaRok. When the amount of filtering is controlled, TreeShrink outperforms RogueNaRok on three of the five datasets, and the two methods tie on one other dataset.
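
To make the k-shrink objective concrete, the following is a minimal brute-force sketch in Python. It is not the polynomial-time algorithm from the paper: it simply tries every set of k leaves and keeps the one that leaves the smallest diameter, which is only feasible for tiny trees. The edge-list representation, the function names, and the example tree are all assumptions made for illustration. The sketch relies on the fact that removing leaves does not change the path length between the leaves that remain, so the diameter after removal is the largest pairwise distance among the kept leaves.

```python
from itertools import combinations


def leaf_distances(edges):
    """Return (leaves, dist), where dist[(a, b)] is the path length between leaves a and b.

    `edges` is a list of (node, node, branch_length) tuples describing an
    unrooted tree (an illustrative representation, not a TreeShrink format).
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        # Iterative DFS: in a tree, each node is reached by exactly one path.
        stack = [(src, None, 0.0)]
        while stack:
            node, parent, d = stack.pop()
            if node != src and len(adj[node]) == 1:
                dist[(src, node)] = d
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + w))
    return leaves, dist


def diameter(leaves, dist):
    """Largest leaf-to-leaf distance among the given leaves (0.0 if fewer than two)."""
    return max((dist[(a, b)] for a, b in combinations(leaves, 2)), default=0.0)


def k_shrink_brute_force(edges, k):
    """Try every k-subset of leaves; return (removed_set, resulting_diameter)."""
    leaves, dist = leaf_distances(edges)
    best = None
    for removed in combinations(leaves, k):
        kept = [x for x in leaves if x not in removed]
        d = diameter(kept, dist)
        if best is None or d < best[1]:
            best = (set(removed), d)
    return best


# Hypothetical four-leaf tree in which leaf D sits on an unusually long branch.
edges = [
    ("A", "x", 1.0), ("B", "x", 1.0),
    ("x", "y", 1.0),
    ("C", "y", 1.0), ("D", "y", 10.0),
]
removed, new_diameter = k_shrink_brute_force(edges, 1)
print(removed, new_diameter)
```

On this toy tree the full diameter is 12.0 (the A-to-D or B-to-D path), and removing the single leaf D reduces it to 3.0, so D is the leaf selected for k = 1. The exhaustive search grows combinatorially with the number of leaves, which is why an efficient solution such as the one described in the abstract matters in practice.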

Citation (APA)

Mai, U., & Mirarab, S. (2017). TreeShrink: Efficient detection of outlier tree leaves. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10562 LNBI, pp. 116–140). Springer Verlag. https://doi.org/10.1007/978-3-319-67979-2_7