GMeta: A Novel Algorithm to Utilize Highly Connected Components for Metagenomic Binning


Abstract

Metagenomic binning is the task of clustering metagenomic sequences or contigs, or of assigning taxonomy to them. Because metagenomic samples contain a massive abundance of organisms, the number of nucleotide sequences grows rapidly, which adds to the complexity of binning algorithms. Unsupervised classification has gained a reputation in recent years, with various state-of-the-art tools released, because the reference database required by reference-based methods is lacking. Exploiting the overlap information between reads has driven the success of various unsupervised methods, which reach high accuracy. These methods rest on the evidence that the average proportion of l-mers shared between genomes of different species is very small when l is sufficiently large. This paper introduces a novel algorithm for binning metagenomic sequences without requiring reference databases by utilizing highly connected components inside a weighted overlap graph of reads. Experimental results show that precision improves over other well-known binning tools for both short and long sequences.
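The general idea behind overlap-based binning can be illustrated with a small sketch. This is not the GMeta algorithm, whose details the abstract does not give; it is a simplified stand-in that weights each pair of reads by the number of l-mers they share, drops weak edges, and groups reads by plain connected components (via union-find) rather than highly connected components. The function names, the toy reads, and the `min_shared` threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def lmers(read, l):
    """Return the set of all length-l substrings of a read."""
    return {read[i:i + l] for i in range(len(read) - l + 1)}


def overlap_graph(reads, l, min_shared):
    """Weighted graph: edge (i, j) -> number of l-mers shared by reads i and j.

    Edges with fewer than min_shared common l-mers are dropped
    (min_shared is an illustrative threshold, not a value from the paper).
    """
    index = defaultdict(set)  # l-mer -> ids of reads that contain it
    for rid, read in enumerate(reads):
        for lmer in lmers(read, l):
            index[lmer].add(rid)
    weights = defaultdict(int)
    for rids in index.values():
        for i, j in combinations(sorted(rids), 2):
            weights[(i, j)] += 1
    return {edge: w for edge, w in weights.items() if w >= min_shared}


def components(n, edges):
    """Group node ids 0..n-1 into connected components with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = defaultdict(list)
    for node in range(n):
        groups[find(node)].append(node)
    return sorted(groups.values())


# Toy reads: the first two overlap heavily, as do the last two.
reads = ["ACGTACGTGA", "CGTACGTGAT", "TTTTGGGGCC", "TTTGGGGCCA"]
graph = overlap_graph(reads, l=5, min_shared=3)
bins = components(len(reads), graph)
print(graph)
print(bins)
```

On these toy reads, each overlapping pair shares five 5-mers and the two pairs share none, so the reads fall into two bins. Real data would call for a much larger l and a stricter notion of connectivity, since a single spurious edge is enough to merge two connected components.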

Cite


Pham, H. T., Vinh, L. V., Lang, T. V., & Tran, V. H. (2019). GMeta: A Novel Algorithm to Utilize Highly Connected Components for Metagenomic Binning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11814 LNCS, pp. 545–559). Springer. https://doi.org/10.1007/978-3-030-35653-8_35
