Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps

Alexander T. Dilthey; Chirag Jain; Sergey Koren; Adam M. Phillippy

Journal Article, Open Access

Nature Communications (2019) 10(1)

DOI: 10.1038/s41467-019-10934-2

Citations: 100

Readers: 269

Abstract

Metagenomic sequence classification should be fast, accurate and information-rich. Emerging long-read sequencing technologies promise to improve the balance between these factors, but most existing methods were designed for short reads. MetaMaps is a new method, specifically developed for long reads, capable of mapping a long-read metagenome to a comprehensive RefSeq database with >12,000 genomes in <16 GB of RAM on a laptop computer. Integrating approximate mapping with probabilistic scoring and EM-based estimation of sample composition, MetaMaps achieves >94% accuracy for species-level read assignment and r² > 0.97 for the estimation of sample composition on both simulated and real data when the sample genomes or close relatives are present in the classification database. To address novel species and genera, which are comparatively harder to predict, MetaMaps outputs mapping locations and qualities for all classified reads, enabling functional studies (e.g. gene presence/absence) and detection of incongruities between sample and reference genomes.
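
The abstract mentions EM-based estimation of sample composition. As an illustration only, the following minimal Python sketch shows how a generic expectation-maximization loop can estimate genome proportions from per-read mapping likelihoods. It is not the MetaMaps implementation: the function name `em_composition`, the likelihood-matrix input, and the convergence parameters are all assumptions made for this example.

```python
def em_composition(likelihoods, n_iter=100, tol=1e-9):
    """Generic EM sketch (hypothetical helper, not MetaMaps code).

    likelihoods[r][g] is an assumed likelihood of read r given genome g.
    Returns the estimated proportion of reads originating from each genome.
    """
    n_genomes = len(likelihoods[0])
    # Start from a uniform composition.
    pi = [1.0 / n_genomes] * n_genomes

    for _ in range(n_iter):
        counts = [0.0] * n_genomes
        # E-step: split each read across genomes by posterior probability.
        for row in likelihoods:
            weighted = [p * l for p, l in zip(pi, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is incompatible with every genome; skip it
            for g, w in enumerate(weighted):
                counts[g] += w / total

        assigned = sum(counts)
        if assigned == 0.0:
            break  # no read could be assigned; keep the current estimate

        # M-step: the new composition is the normalized expected read count.
        new_pi = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if delta < tol:
            break

    return pi


# Toy input: three reads map only to genome 0, one only to genome 1,
# and one is equally compatible with both.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
estimate = em_composition(toy)
```

In this toy input the ambiguous read is shared between the two genomes in proportion to the current composition estimate, which is how unambiguous reads inform the placement of ambiguous ones.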

Citation (APA)

Dilthey, A. T., Jain, C., Koren, S., & Phillippy, A. M. (2019). Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-10934-2
