Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique

Fatihah Mohd; Masita Abdul Jalil; Noor Maizura Mohamad Noora; Suryani Ismail; Wan Fatin Fatihah Yahya; Mumtazimah Mohamad

Conference Proceedings

Communications in Computer and Information Science (2019) 1097 CCIS 99-110

DOI: 10.1007/978-3-030-36365-9_8

3 Citations

11 Readers

Abstract

Imbalanced datasets occur in many real-world applications. Resampling is an effective solution because it produces a balanced class distribution. This study uses the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling method, to handle imbalanced datasets by increasing the number of instances of the minority class. The technique reduces the degree of imbalance in the dataset by generating new synthetic samples, producing a balanced training dataset that replaces the imbalanced one. The balanced datasets were then used to train machine learning algorithms to diagnose the disease class. In experiments on two real-world datasets, an oral cancer dataset and the erythemato-squamous diseases dataset from the UCI machine learning repository, the over-sampling method gave better results in clinical disease classification.
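The over-sampling step summarized in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: pick a minority sample, pick one of its nearest minority-class neighbours, and create a new point somewhere on the line segment between them. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and the sketch handles numeric features only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (one class only).
    Hypothetical helper for illustration, not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up for illustration): 6 majority vs 3 minority samples.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.2]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Create just enough synthetic minority points to match the majority count.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because each synthetic point is interpolated between two existing minority samples, it stays inside the region the minority class already occupies. In practice the synthetic samples would be added to the training split only, so that evaluation still reflects the original class distribution.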

Citation (APA)

Mohd, F., Abdul Jalil, M., Noora, N. M. M., Ismail, S., Yahya, W. F. F., & Mohamad, M. (2019). Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique. In Communications in Computer and Information Science (Vol. 1097 CCIS, pp. 99–110). Springer. https://doi.org/10.1007/978-3-030-36365-9_8
