Machine learning model for lymph node metastasis prediction in breast cancer using random forest algorithm and mitochondrial metabolism hub genes

Abstract

Breast cancer metastasis can be fatal, so predicting metastasis is critical for establishing effective treatment strategies. RNA sequencing (RNA-seq) is a useful tool for identifying genes that promote and support metastasis development. Hub gene analysis is a bioinformatics method that can effectively analyze RNA-seq results and can be used to specify the set of genes most relevant to the function of the cells involved in metastasis. Herein, a new machine learning model based on RNA-seq data was developed using the random forest algorithm and hub genes, and its accuracy in predicting breast cancer metastasis was estimated. Single-cell breast cancer samples (56 metastatic and 38 non-metastatic) were obtained from the Gene Expression Omnibus database, and the Weighted Gene Correlation Network Analysis package was used to select gene modules and hub genes, which function in mitochondrial metabolism. A machine learning prediction model using the hub gene set was devised and its accuracy was evaluated. A prediction model comprising 54-functional-gene modules and the hub gene set (NDUFA9, NDUFB5, and NDUFB3) showed an accuracy of 0.769 ± 0.02, 0.782 ± 0.012, and 0.945 ± 0.016, respectively. The test accuracy of the hub gene set was over 93%, and that of the prediction model with random forest and hub genes was over 91%. A breast cancer metastasis dataset from The Cancer Genome Atlas was used for external validation, showing an accuracy of over 91%. The hub gene assay, combined with machine learning, can be used to predict breast cancer metastasis.
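
The general approach described in the abstract, a random forest classifier trained on the expression of a small hub gene set and scored by accuracy, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the expression values are synthetic random numbers, the use of scikit-learn, the helper names (`make_synthetic_expression`, `cross_validated_accuracy`), the 5-fold cross-validation scheme, and all hyperparameters are assumptions. Only the gene names and the sample counts (56 metastatic, 38 non-metastatic) come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Hub genes named in the abstract; everything else below is illustrative.
HUB_GENES = ["NDUFA9", "NDUFB5", "NDUFB3"]


def make_synthetic_expression(n_metastatic=56, n_non_metastatic=38, seed=0):
    """Build a synthetic expression matrix (samples x hub genes).

    Assumption: values are drawn from normal distributions with a small
    class-dependent shift. They are NOT real RNA-seq measurements.
    """
    rng = np.random.default_rng(seed)
    x_met = rng.normal(loc=1.0, scale=1.0, size=(n_metastatic, len(HUB_GENES)))
    x_non = rng.normal(loc=0.0, scale=1.0, size=(n_non_metastatic, len(HUB_GENES)))
    X = np.vstack([x_met, x_non])
    # Label 1 = metastatic, 0 = non-metastatic.
    y = np.concatenate([
        np.ones(n_metastatic, dtype=int),
        np.zeros(n_non_metastatic, dtype=int),
    ])
    return X, y


def cross_validated_accuracy(X, y, n_splits=5, seed=0):
    """Return mean and standard deviation of accuracy over stratified folds."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    return scores.mean(), scores.std()


X, y = make_synthetic_expression()
mean_acc, std_acc = cross_validated_accuracy(X, y)
print(f"accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```

Because the data are synthetic, the printed accuracy says nothing about the published results; the sketch only shows how a mean ± standard deviation accuracy of the kind reported in the abstract could be computed.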

Cite

Kim, B. C., Kim, J., Lim, I., Kim, D. H., Lim, S. M., & Woo, S. K. (2021). Machine learning model for lymph node metastasis prediction in breast cancer using random forest algorithm and mitochondrial metabolism hub genes. Applied Sciences (Switzerland), 11(7). https://doi.org/10.3390/app11072897
