In this study, we investigate faecal microbiota composition to evaluate how well classification algorithms identify inflammatory bowel disease (IBD) and its two types: Crohn's disease (CD) and ulcerative colitis (UC). Of the many algorithms investigated, a random forest (RF) classifier was selected for detailed evaluation in a three-class classification task (CD versus UC versus nonIBD) and two binary classification tasks (nonIBD versus IBD, and CD versus UC). We addressed class imbalance and performed an extensive parameter search, dimensionality reduction, and two-level classification. In three-class classification, our best model reaches an average F1 score of 91%, which confirms the strong connection between IBD and the gastrointestinal microbiome. Among the most important features in three-class classification are the species Staphylococcus hominis, Porphyromonas endodontalis and Slackia piriformis, and the genus Bacteroidetes.
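To make the two-level classification and the averaged F1 score concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the two-level scheme chains the two binary tasks (nonIBD versus IBD first, then CD versus UC for samples flagged as IBD) and that the F1 score is macro-averaged over the three classes; the abstract states neither detail. The function names `two_level_predict`, `f1_for_class` and `macro_f1` are hypothetical, and the two binary classifiers are passed in as plain callables standing in for trained models such as random forests.

```python
from typing import Any, Callable, Sequence

LABELS = ("nonIBD", "CD", "UC")


def two_level_predict(
    sample: Any,
    is_ibd: Callable[[Any], bool],
    is_cd: Callable[[Any], bool],
) -> str:
    """Chain two binary decisions into one three-class label.

    Assumption: level 1 separates nonIBD from IBD, and level 2 is only
    consulted for samples that level 1 flagged as IBD.
    """
    if not is_ibd(sample):
        return "nonIBD"
    return "CD" if is_cd(sample) else "UC"


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    # Convention chosen here: a class with no true or predicted samples scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Sequence[str] = LABELS,
) -> float:
    """Unweighted mean of the per-class F1 scores (macro averaging, assumed)."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)
```

Macro averaging weights each class equally regardless of its size, which is one common choice under class imbalance; a support-weighted average would give a different number on the same predictions.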
CITATION
Mihajlović, A., Mladenović, K., Lončar-Turukalo, T., & Brdar, S. (2021). Machine learning based metagenomic prediction of inflammatory bowel disease. In Studies in Health Technology and Informatics (Vol. 285, pp. 165–170). IOS Press BV. https://doi.org/10.3233/SHTI210591