Medical imbalanced data classification

Sara Belarouci; Mohammed Amine Chikh

Journal Article (Open Access)

Advances in Science, Technology and Engineering Systems (2017) 2(3) 116-124

DOI: 10.25046/aj020316

Abstract

Imbalanced datasets are common in health applications. In medical data classification, the number of samples per class is often imbalanced, with at least one class constituting only a very small minority of the data. At the same time, this imbalance is a difficult problem for most machine learning algorithms. Many works have dealt with the classification of imbalanced datasets. In this paper, we propose a learning method based on a cost-sensitive extension of the Least Mean Square (LMS) algorithm that penalizes the errors of different samples with different weights, together with some rules of thumb for determining those weights. After the balancing phase, we apply different techniques (Support Vector Machine [SVM], K-Nearest Neighbor [K-NN], and Multilayer Perceptron [MLP]) to the balanced datasets. We also compare the results obtained before and after applying the balancing method. We obtained the best results compared to the literature, with a classification accuracy of 100%.
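The idea of a cost-sensitive LMS update can be illustrated with a minimal sketch. The abstract does not state the paper's weighting rules, so the code below assumes inverse-frequency class weights (the same heuristic scikit-learn uses for `class_weight="balanced"`), labels in {-1, +1}, and a plain linear model; the function names, learning rate, and epoch count are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def class_weights(y):
    """Per-class cost: n_samples / (n_classes * class_count).

    Assumed rule of thumb; the paper's own weighting rules are not
    given in the abstract.
    """
    labels, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(labels) * counts)
    return dict(zip(labels.tolist(), weights.tolist()))


def cost_sensitive_lms(X, y, lr=0.01, epochs=50, seed=0):
    """Stochastic LMS where each sample's error is scaled by its class cost."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    cw = class_weights(y)
    costs = np.array([cw[label] for label in y.tolist()])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            err = y[i] - Xb[i] @ w            # standard LMS error term
            w += lr * costs[i] * err * Xb[i]  # minority errors weigh more
    return w


def predict(w, X):
    """Threshold the linear output at zero to get labels in {-1, +1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0, 1, -1)


# Toy one-dimensional data: six majority samples (+1), two minority (-1).
X = np.array([[2.0], [2.5], [3.0], [1.5], [2.2], [2.8], [-2.0], [-2.5]])
y = np.array([1, 1, 1, 1, 1, 1, -1, -1])
w = cost_sensitive_lms(X, y)
print(class_weights(y), predict(w, X))
```

With these weights, an error on a minority sample moves the weight vector three times as far as the same error on a majority sample in this toy set, which is what keeps the decision boundary from drifting toward the minority class.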

Cite

Belarouci, S., & Chikh, M. A. (2017). Medical imbalanced data classification. Advances in Science, Technology and Engineering Systems, 2(3), 116–124. https://doi.org/10.25046/aj020316
