A concurrent subtractive assembly approach for identification of disease associated sub-metagenomes

Abstract

Comparative analysis of metagenomes can be used to detect sub-metagenomes (species or gene sets) that are associated with specific phenotypes (e.g., host status). The typical workflow is to assemble and annotate metagenomic datasets individually or as a whole, followed by statistical tests to identify differentially abundant species/genes. We previously developed subtractive assembly (SA), a de novo assembly approach for comparative metagenomics that first detects differential reads that distinguish between two groups of metagenomes and then assembles only these reads. Application of SA to type 2 diabetes (T2D) microbiomes revealed new microbial genes associated with T2D. Here we further developed a Concurrent Subtractive Assembly (CoSA) approach, which uses a Wilcoxon rank-sum (WRS) test to detect k-mers that are differentially abundant between two groups of microbiomes (by contrast, SA only checks ratios of k-mer counts in one pooled sample versus the other). It then uses the identified differential k-mers to extract reads that are likely sequenced from the sub-metagenome with consistent abundance differences between the groups of microbiomes. Further, CoSA attempts to reduce the redundancy of reads (from abundant common species) by excluding reads containing abundant k-mers. Using simulated microbiome datasets and T2D datasets, we show that CoSA achieves strikingly better performance in detecting consistent changes than SA does, and it enables the detection and assembly of genomes and genes with minor abundance differences. An SVM classifier built upon the microbial genes detected by CoSA from the T2D datasets can accurately discriminate patients from healthy controls, with an AUC of 0.94 (10-fold cross-validation), and therefore these differential genes (207 genes) may serve as potential microbial marker genes for T2D.
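
The k-mer selection and read extraction steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the normal approximation used for the rank-sum p-value (with no tie correction in the variance), and the parameters `alpha`, `min_fraction`, and `abundant_kmers` are all assumptions chosen for illustration, and a real tool would work on much larger k-mer tables with normalized counts.

```python
import math
from collections import Counter


def kmers(read, k):
    """Yield every substring of length k in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_kmers(reads, k):
    """Count k-mers over all reads of one sample."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def rank_sum_pvalue(xs, ys):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation).

    Ties receive average ranks; the variance is NOT tie-corrected,
    which is a simplification made for this sketch.
    """
    pooled = sorted((v, g) for g, vals in enumerate((xs, ys)) for v in vals)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n1, n2 = len(xs), len(ys)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def differential_kmers(group_a, group_b, k, alpha):
    """Return k-mers whose per-sample counts differ between the two groups.

    group_a and group_b are lists of samples; each sample is a list of reads.
    alpha is a hypothetical significance cutoff (no multiple-testing correction).
    """
    counts_a = [count_kmers(sample, k) for sample in group_a]
    counts_b = [count_kmers(sample, k) for sample in group_b]
    selected = set()
    for kmer in set().union(*counts_a, *counts_b):
        xs = [c[kmer] for c in counts_a]
        ys = [c[kmer] for c in counts_b]
        if rank_sum_pvalue(xs, ys) < alpha:
            selected.add(kmer)
    return selected


def extract_reads(reads, diff_kmers, k, min_fraction, abundant_kmers=frozenset()):
    """Keep reads dominated by differential k-mers.

    Reads containing any k-mer from abundant_kmers are dropped, mirroring the
    redundancy-reduction idea; min_fraction is an assumed voting threshold.
    """
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue
        if any(km in abundant_kmers for km in read_kmers):
            continue
        hits = sum(km in diff_kmers for km in read_kmers)
        if hits / len(read_kmers) >= min_fraction:
            kept.append(read)
    return kept
```

In this sketch, each k-mer is tested across samples rather than on pooled counts, which is the distinction the abstract draws between CoSA and SA. The reads returned by `extract_reads` would then be passed to an assembler, a step not shown here.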

Cite

Han, W., Wang, M., & Ye, Y. (2017). A concurrent subtractive assembly approach for identification of disease associated sub-metagenomes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10229 LNCS, pp. 18–33). Springer Verlag. https://doi.org/10.1007/978-3-319-56970-3_2
