Gene selection for cancer classification: a new hybrid filter-C5.0 approach for breast cancer risk prediction

Abstract

Despite significant recent progress in data mining technologies, predicting and diagnosing breast cancer risk at an early stage using DNA microarray technology remains a real challenge. The difficulty comes mainly from the high dimensionality of gene expression data, i.e., an enormous number of genes versus only a few tens of subjects (samples). To overcome this imbalance between the number of genes and the number of samples, a gene selection phase becomes a crucial step in gene expression data analysis. This study proposes a new decision-tree-based attribute (gene) selection strategy with two stages: a Fisher-score-based filter technique, followed by the gene selection ability of the C5.0 algorithm. The proposed strategy is assessed using an ensemble of machine learning algorithms to classify each subject (patient). Compared with recent previous works, the experimental results show that the new gene selection strategy achieved the highest breast cancer prediction performance while using only five genes as predictors out of 24,481 genes.
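The two-stage idea described in the abstract (a Fisher-score filter followed by tree-based selection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the common multi-class form of the Fisher score (between-class scatter divided by within-class scatter), and because scikit-learn does not provide C5.0, it substitutes a CART decision tree with the entropy criterion as a stand-in. The function names `fisher_scores` and `hybrid_select` and the parameter `top_k` are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter.

    Assumed form; the paper may use a different variant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # Small constant guards against division by zero for constant features.
    return between / (within + 1e-12)


def hybrid_select(X, y, top_k=100, random_state=0):
    """Stage 1: keep the top_k genes by Fisher score.
    Stage 2: keep only the genes a decision tree actually splits on.

    The tree is CART with entropy, a stand-in for C5.0 (an assumption).
    """
    X = np.asarray(X, dtype=float)
    scores = fisher_scores(X, y)
    top_k = min(top_k, X.shape[1])
    filtered = np.argsort(scores)[::-1][:top_k]

    tree = DecisionTreeClassifier(criterion="entropy", random_state=random_state)
    tree.fit(X[:, filtered], y)

    # Leaf nodes carry a negative feature index; internal nodes hold the
    # column index (relative to the filtered matrix) used for the split.
    used = tree.tree_.feature
    used = np.unique(used[used >= 0])
    return filtered[used]
```

Calling `hybrid_select(X, y, top_k=100)` on a samples-by-genes matrix returns the column indices of the genes retained by both stages. The filter keeps the tree from searching all 24,481 genes, and the tree then discards filtered genes that add no discriminative splits, which is how a very small final gene set can emerge.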

Cite

Hamim, M., Moudden, I. E., Moutachaouik, H., & Hain, M. (2021). Gene selection for cancer classification: a new hybrid filter-C5.0 approach for breast cancer risk prediction. Advances in Science, Technology and Engineering Systems, 6(1), 871–878. https://doi.org/10.25046/aj060196
