Pooled assembly of marine metagenomic datasets: Enriching annotation through chimerism

6Citations
Citations of this article
74Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Motivation: Despite advances in high-throughput sequencing, marine metagenomic samples remain largely opaque. A typical sample contains billions of microbial organisms from thousands of genomes and quadrillions of DNA base pairs. Its derived metagenomic dataset underrepresents this complexity by orders of magnitude because of the sparseness and shortness of sequencing reads. Read shortness and sequencing errors pose a major challenge to accurate species and functional annotation. This includes distinguishing known from novel species. Often the majority of reads cannot be annotated and thus cannot help our interpretation of the sample. Results: Here, we demonstrate quantitatively how careful assembly of marine metagenomic reads within, but also across, datasets can alleviate this problem. For 10 simulated datasets, each with species complexity modeled on a real counterpart, chimerism remained within the same species for most contigs (97%). For 42 real pyrosequencing ('454') datasets, assembly increased the proportion of annotated reads, and even more so when datasets were pooled, by on average 1.6% (max 6.6%) for species, 9.0% (max 28.7%) for Pfam protein domains and 9.4% (max 22.9%) for PANTHER gene families. Our results outline exciting prospects for data sharing in the metagenomics community. While chimeric sequences should be avoided in other areas of metagenomics (e.g. biodiversity analyses), conservative pooled assembly is advantageous for annotation specificity and sensitivity. Intriguingly, our experiment also found potential prospects for (low-cost) discovery of new species in 'old' data.

Cite

CITATION STYLE

APA

Magasin, J. D., & Gerloff, D. L. (2015). Pooled assembly of marine metagenomic datasets: Enriching annotation through chimerism. Bioinformatics, 31(3), 311–317. https://doi.org/10.1093/bioinformatics/btu546

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free