A dataset in which one class is outnumbered by the other classes is known as class-imbalanced data. Many standard learning algorithms suffer degraded classification performance on imbalanced data. The problem can be addressed by existing conventional methods such as cost-sensitive learning, sampling, or ensemble methods. However, these methods alter the original data distribution, which can lead to the loss of useful information, cause unexpected errors, or aggravate overfitting. In this research, a local Mahalanobis distance learning (LMDL) method is applied to the nearest neighbor (NN) classifier to improve classification performance on imbalanced datasets. LMDL uses multiple distance metrics to investigate the data effectively and to obtain the relevant features from that analysis. The distance metrics use the original data to learn the prototypes and to support the NN classifier. A number of experiments on various datasets are conducted to validate the quality and the efficiency of the proposed LMDL method. The experimental results show that the proposed LMDL achieves nearly 82% on the E-coli dataset, 94% on the breast cancer dataset, and 98% on the Iris dataset for all metrics, namely accuracy, precision, recall, and F-measure.
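The idea of pairing a locally defined Mahalanobis distance with a nearest-prototype rule can be illustrated with a minimal sketch. This is not the LMDL algorithm of the paper, whose learning procedure the abstract does not describe. It assumes one prototype per class (the class mean) and one metric per class (the regularised inverse covariance), both estimated from the original, unresampled data. The function names `fit_local_metrics` and `predict`, the `reg` parameter, and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_local_metrics(X, y, reg=1e-3):
    """Estimate one prototype and one Mahalanobis metric per class.

    Assumption: the prototype is the class mean and the metric is the
    inverse of the class covariance. Each class needs at least two samples.
    """
    models = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # Regularise so the covariance stays invertible for small minority classes.
        cov = np.cov(Xc, rowvar=False) + reg * np.eye(X.shape[1])
        models[label] = (mean, np.linalg.inv(cov))
    return models

def predict(models, X):
    """Assign each row of X to the class whose prototype is nearest
    under that class's own squared Mahalanobis distance."""
    labels = list(models)
    dists = []
    for label in labels:
        mean, inv_cov = models[label]
        diff = X - mean
        # Squared Mahalanobis distance (x - m)^T M (x - m), one value per row.
        dists.append(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))
    nearest = np.argmin(np.stack(dists, axis=1), axis=1)
    return np.array(labels)[nearest]

# Synthetic imbalanced data (illustrative only): 200 majority vs 20 minority points.
rng = np.random.default_rng(0)
X_major = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
X_minor = rng.normal(loc=6.0, scale=1.0, size=(20, 2))
X = np.vstack([X_major, X_minor])
y = np.array([0] * 200 + [1] * 20)

models = fit_local_metrics(X, y)
print(predict(models, np.array([[0.0, 0.0], [6.0, 6.0]])))
```

Because each class carries its own metric, the minority class is not forced to share a covariance dominated by the majority class, and no samples are added or removed, so the original data distribution is left untouched.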
Siddappa, N. G., & Kampalappa, T. (2020). Imbalance Data Classification Using Local Mahalanobis Distance Learning Based on Nearest Neighbor. SN Computer Science, 1(2). https://doi.org/10.1007/s42979-020-0085-x