Analyzing Large Microbiome Datasets Using Machine Learning and Big Data

Abstract

Metagenomics promises valuable new insights into the role of microbiomes in eukaryotic hosts such as humans. As sequencing costs fall, public and private repositories of human metagenomic datasets are growing quickly. A metagenomic dataset can contain terabytes of raw data, which makes processing difficult but also creates an opportunity for machine learning methods such as deep learning that require large datasets. However, unlike classical machine learning algorithms, deep learning is still rarely used in metagenomics. Whichever algorithm is chosen, it is usually applied not to the raw data but to the output of several preprocessing steps. Performing this preprocessing and the actual analysis in an automated, reproducible, and scalable way is a further challenge. These and other challenges can be addressed by adapting established big data methods and architectures to the needs of microbiome analysis and DNA sequence processing. A conceptual architecture for applying machine learning and big data to metagenomic datasets was recently presented and initially validated by analyzing the rumen microbiome. This paper discusses how the same architecture can be used for clinical purposes.

Citation

Krause, T., Wassan, J. T., Mc Kevitt, P., Wang, H., Zheng, H., & Hemmje, M. (2021). Analyzing Large Microbiome Datasets Using Machine Learning and Big Data. BioMedInformatics, 1(3), 138–165. https://doi.org/10.3390/biomedinformatics1030010
