The accuracy of Random Forest performance can be improved by conducting a feature selection with a balancing strategy


Abstract

One of the significant purposes of building a model is to increase its accuracy within a shorter timeframe through the feature selection process. This is carried out by determining the importance of the available features in a dataset using Information Gain (IG). The process calculates the amount of information contained in each feature, and features with high values are selected to accelerate the performance of an algorithm. In selecting informative features, Information Gain uses a threshold value (cut-off). Therefore, this research aims to improve the time and accuracy of feature selection by integrating IG, the Fast Fourier Transform (FFT), and the Synthetic Minority Oversampling Technique (SMOTE). The feature selection model is then applied to Random Forest, a tree-based machine learning algorithm with random feature selection. A total of eight datasets, consisting of three balanced and five imbalanced datasets, were used to conduct this research. SMOTE was applied to the imbalanced datasets to balance the data. The results showed that feature selection using Information Gain, FFT, and SMOTE improved the performance accuracy of Random Forest.
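The pipeline described in the abstract can be sketched in scikit-learn terms. This is a minimal illustration, not the authors' implementation: it uses `mutual_info_classif` as the Information Gain score, the score mean as a hypothetical cut-off threshold, and a simplified hand-rolled SMOTE (interpolating between randomly paired minority samples rather than k-nearest neighbours). The FFT step is omitted because the abstract does not detail how it is applied; the dataset is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced dataset (not one of the paper's eight datasets).
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# Step 1: score features by Information Gain (mutual information) and keep
# those above a cut-off; the mean score is an illustrative threshold only.
ig = mutual_info_classif(X, y, random_state=0)
X_sel = X[:, ig >= ig.mean()]

# Step 2: simplified SMOTE -- synthesize minority samples by interpolating
# between randomly paired minority points (real SMOTE interpolates toward
# k-nearest neighbours).
rng = np.random.default_rng(0)
minority = X_sel[y == 1]
n_new = int((y == 0).sum() - (y == 1).sum())
a = minority[rng.integers(0, len(minority), n_new)]
b = minority[rng.integers(0, len(minority), n_new)]
X_bal = np.vstack([X_sel, a + rng.random((n_new, 1)) * (b - a)])
y_bal = np.concatenate([y, np.ones(n_new, dtype=int)])

# Step 3: train Random Forest on the selected, balanced features.
X_tr, X_te, y_tr, y_te = train_test_split(X_bal, y_bal, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 3))
```

In this sketch the IG cut-off reduces the feature count before training, and SMOTE equalizes the class counts, mirroring the two ideas the abstract combines.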

CITATION (APA)

Prasetiyowati, M. I., Maulidevi, N. U., & Surendro, K. (2022). The accuracy of Random Forest performance can be improved by conducting a feature selection with a balancing strategy. PeerJ Computer Science, 8. https://doi.org/10.7717/PEERJ-CS.1041
