Background: Sequencing partial 16S rRNA genes is a cost-effective method for quantifying the microbial composition of an environment, such as the human gut. Downstream analysis relies on binning reads into microbial groups by considering each unique sequence as a different microbe, querying a database to get taxonomic labels from sequences, or clustering similar sequences together. However, these approaches do not fully capture evolutionary relationships between microbes, limiting the ability to identify differentially abundant groups of microbes between a diseased and a control cohort. We present sequence-based biomarkers (SBBs), an aggregation method that groups microbes using single variants and combinations of variants within their 16S sequences. We compare SBBs against existing aggregation methods (OTU clustering and MicroPheno or DiTaxa features) in several benchmarking tasks: biomarker discovery via permutation test, biomarker discovery via linear discriminant analysis, and phenotype prediction. We demonstrate that SBBs perform on par with or better than state-of-the-art methods in biomarker discovery and phenotype prediction. Results: On two independent datasets, SBBs identify differentially abundant groups of microbes with similar or higher statistical significance than existing methods, both in a permutation-test-based analysis and using linear discriminant analysis effect size. By grouping microbes by SBB, we can identify several differentially abundant microbial groups (FDR
Chrisman, B. S., Paskov, K. M., Stockham, N., Jung, J. Y., Varma, M., Washington, P. Y., … Wall, D. P. (2021). Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers. BMC Bioinformatics, 22(1). https://doi.org/10.1186/s12859-021-04427-7