Imbalanced datasets occur in many real-world applications. Resampling is an effective remedy because it produces a balanced class distribution. This study uses the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling method, to handle imbalanced datasets by increasing the number of minority-class instances. The technique reduces the imbalance percentage of a dataset by generating new synthetic samples, so that a balanced training dataset replaces the class-imbalanced one. The balanced datasets were then used to train machine learning algorithms to diagnose the disease class. In experiments on real-world datasets, an oral cancer dataset and the erythemato-squamous diseases dataset from the UCI machine learning datasets, the over-sampling method showed better results in clinical disease classification.
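The over-sampling idea described above can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote`, its parameters, and the toy data are assumptions made for illustration only. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate: base + gap * (neighbour - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (invented): 20 majority points and 4 minority points.
majority = [[float(x), float(y)] for x in range(5) for y in range(4)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5], [12.0, 11.0]]

# Generate enough synthetic minority points to match the majority count.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In this sketch only the minority class is resampled, and the synthetic points would be added to the training split only, so that evaluation data remains untouched.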
CITATION
Mohd, F., Abdul Jalil, M., Noora, N. M. M., Ismail, S., Yahya, W. F. F., & Mohamad, M. (2019). Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique. In Communications in Computer and Information Science (Vol. 1097 CCIS, pp. 99–110). Springer. https://doi.org/10.1007/978-3-030-36365-9_8