Hepatitis is a common public health problem worldwide, affecting populations in almost every country. Machine learning has been widely used to classify various diseases, including hepatitis. In this research, the Random Forest algorithm is applied to a dataset of hepatitis patients to classify whether a patient will live or die. The dataset contains missing values and imbalanced classes, that is, unequal numbers of healthy and sick patients, which often occurs in disease datasets. We replace missing values with the mean or median, and to deal with the class imbalance we use a cost-sensitive method that applies a penalty during classification. A manual feature selection process is also carried out to find features that can be removed while maintaining classification accuracy and quality. The validation method is 10-fold cross-validation, and the Random Forest parameters are tuned to find the best classification result. This research prioritizes classification results that take into account the small amount of data and the class imbalance, so that hepatitis patients can be classified more successfully and accurately. The accuracy obtained is 85.80%.
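The two preprocessing ideas named above, mean/median imputation and a cost-sensitive class penalty, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract does not say which features use the mean and which the median, nor what penalty values were applied. The function names, the toy data, and the inverse-frequency weighting rule (`n_samples / (n_classes * class_count)`) are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, median


def impute(column, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in column]


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Error rate in which each mistake is charged the weight of its true class."""
    cost = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return cost / sum(weights[t] for t in y_true)


# Hypothetical toy data: one numeric feature with gaps, and imbalanced labels.
feature = [1.0, None, 2.0, 9.0]
labels = ["live"] * 6 + ["die"] * 2

filled = impute(feature, strategy="median")
weights = balanced_class_weights(labels)

# A trivial classifier that predicts the majority class for every patient.
always_live = ["live"] * len(labels)
penalized = weighted_error(labels, always_live, weights)
```

With these toy labels, the majority-class predictor has a plain error rate of 2/8 = 0.25, but its weighted error is 0.5, because each missed "die" case costs three times as much as a missed "live" case. In scikit-learn, a comparable effect is available through the `class_weight` parameter of `RandomForestClassifier`; whether the authors used that mechanism is not stated in the abstract.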
CITATION STYLE
Nugroho, A. … Sfenrianto. (2020). Hepatitis Patient Classification using Random Forest Algorithms with Cost Sensitive Method. International Journal of Engineering and Advanced Technology, 9(3), 2528–2532. https://doi.org/10.35940/ijeat.c5903.029320