PplacerDC: A new scalable phylogenetic placement method

Abstract

Motivation: Phylogenetic placement (i.e., the insertion of a sequence into a phylogenetic tree) is a basic step in several bioinformatics pipelines, including taxon identification in metagenomic analysis and large-scale phylogeny estimation. The most accurate current method is pplacer, which attempts to optimize the placement using maximum likelihood, but it frequently fails on datasets where the phylogenetic tree has 5,000 leaves. APPLES is currently the most scalable method, and EPA-ng, although more scalable than pplacer and more accurate than APPLES, also fails on many 50,000-taxon trees. Here we describe pplacerDC, a divide-and-conquer approach that enables pplacer to be used when the phylogenetic tree is very large.

Results: Our study shows that pplacerDC has excellent accuracy and scalability: it matches pplacer where pplacer can run, improves accuracy compared to APPLES and EPA-ng, and runs on datasets with up to 100,000 sequences.

Availability: The pplacerDC code is available on GitHub at https://github.com/kodingkoning/pplacerDC.
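The divide-and-conquer idea named in the abstract can be illustrated with a toy sketch. This is not the pplacerDC implementation: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. The reference leaves are split into fixed-size chunks rather than subtrees, and a Hamming-distance nearest-leaf search stands in for a likelihood-based placement tool such as pplacer. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def split_into_subsets(leaves: List[str], max_size: int) -> List[List[str]]:
    """Divide step (toy): chunk the leaf names into groups of at most max_size.

    A real method would decompose the tree into subtrees instead.
    """
    return [leaves[i:i + max_size] for i in range(0, len(leaves), max_size)]


def place_in_subset(
    query: str, subset: List[str], seqs: Dict[str, str]
) -> Tuple[str, int]:
    """Stand-in for a per-subset placement tool.

    Returns the closest leaf in the subset and its distance to the query.
    A likelihood-based placer would return an edge and a likelihood score.
    """
    best = min(subset, key=lambda name: hamming(query, seqs[name]))
    return best, hamming(query, seqs[best])


def divide_and_conquer_place(
    query: str, seqs: Dict[str, str], max_size: int
) -> Tuple[str, int]:
    """Place the query in every subset, then keep the best-scoring candidate."""
    subsets = split_into_subsets(sorted(seqs), max_size)
    candidates = [place_in_subset(query, s, seqs) for s in subsets]
    return min(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    reference = {"a": "AAAA", "b": "AACC", "c": "GGGG", "d": "AAAT"}
    print(divide_and_conquer_place("AAAT", reference, max_size=2))
```

The point of the pattern is that the per-subset step only ever sees `max_size` leaves, so a tool that fails on very large inputs can still be applied; the final comparison across subsets then selects a single placement.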

Cite

Koning, E., Phillips, M., & Warnow, T. (2021). PplacerDC: A new scalable phylogenetic placement method. In Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021. Association for Computing Machinery, Inc. https://doi.org/10.1145/3459930.3469516
