MTR: Taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks

49Citations
Citations of this article
170Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Motivation: Metagenomics is a recent field of biology that studies microbial communities by analyzing their genomic content directly sequenced from the environment. A metagenomic dataset consists of many short DNA or RNA fragments called reads. One interesting problem in metagenomic data analysis is the discovery of the taxonomic composition of a given dataset. A simple method for this task, called the Lowest Common Ancestor (LCA), is employed in state-of-the-art computational tools for metagenomic data analysis of very short reads (about 100 bp). However LCA has two main drawbacks: it possibly assigns many reads to high taxonomic ranks and it discards a high number of reads. Results: We present MTR, a new method for tackling these drawbacks using clustering at Multiple Taxonomic Ranks. Unlike LCA, which processes the reads one-by-one, MTR exploits information shared by reads. Specifically, MTR consists of two main phases. First, for each taxonomic rank, a collection of potential clusters of reads is generated, and each potential cluster is associated to a taxon at that rank. Next, a small number of clusters is selected at each rank using a combinatorial optimization algorithm. The effectiveness of the resulting method is tested on a large number of simulated and real-life metagenomes. Results of experiments show that MTR improves on LCA by discarding a significantly smaller number of reads and by assigning much more reads at lower taxonomic ranks. Moreover, MTR provides a more faithful taxonomic characterization of the metagenome population distribution. © The Author(s) 2010. Published by Oxford University Press.

Cite

CITATION STYLE

APA

Gori, F., Folino, G., Jetten, M. S. M., & Marchiori, E. (2011). MTR: Taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks. Bioinformatics, 27(2), 196–203. https://doi.org/10.1093/bioinformatics/btq649

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free