Beyond classification: Gene-family phylogenies from shotgun metagenomic reads enable accurate community analysis

6Citations
Citations of this article
98Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Sequence-based phylogenetic trees are a well-established tool for characterizing diversity of both macroorganisms and microorganisms. Phylogenetic methods have recently been applied to shotgun metagenomic data from microbial communities, particularly with the aim of classifying reads. But the accuracy of gene-family phylogenies that characterize evolutionary relationships among short, non-overlapping sequencing reads has not been thoroughly evaluated.Results: To quantify errors in metagenomic read trees, we developed MetaPASSAGE, a software pipeline to generate in silico bacterial communities, simulate a sample of shotgun reads from a gene family represented in the community, orient or translate reads, and produce a profile-based alignment of the reads from which a gene-family phylogenetic tree can be built. We applied MetaPASSAGE to a variety of RNA and protein-coding gene families, built trees using a range of different phylogenetic methods, and compared the resulting trees using topological and branch-length error metrics. We identified read length as one of the major sources of error. Because phylogenetic methods use a reference database of full-length sequences from the gene family to guide construction of alignments and trees, we found that error can also be substantially reduced through increasing the size and diversity of the reference database. Finally, UniFrac analysis, which compares metagenomic samples based on a summary statistic computed over all branches in a read tree, is very robust to the level of error we observe.Conclusions: Bacterial community diversity can be quantified using phylogenetic approaches applied to shotgun metagenomic data. As sequencing reads get longer and more genomes across the bacterial tree of life are sequenced, the accuracy of this approach will continue to improve, opening the door to more applications. © 2013 Riesenfeld and Pollard; licensee BioMed Central Ltd.

Author supplied keywords

Cite

CITATION STYLE

APA

Riesenfeld, S. J., & Pollard, K. S. (2013). Beyond classification: Gene-family phylogenies from shotgun metagenomic reads enable accurate community analysis. BMC Genomics, 14(1). https://doi.org/10.1186/1471-2164-14-419

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free