Over-sampling imbalanced datasets using the covariance matrix

Abstract

INTRODUCTION: Nowadays, many machine learning tasks involve learning from imbalanced datasets, leading to the misclassification of the minority class. One of the state-of-the-art approaches to "solve" this problem at the data level is the Synthetic Minority Over-sampling Technique (SMOTE), which uses the K-Nearest Neighbors (KNN) algorithm to select and generate new instances. OBJECTIVES: This paper presents SMOTE-Cov, a modified SMOTE that uses the Covariance Matrix instead of KNN to balance datasets with continuous attributes and a binary class. METHODS: We implemented two variants: SMOTE-CovI, which generates new values within the interval of each attribute, and SMOTE-CovO, which allows some values to fall outside the interval of the attributes. RESULTS: The results show that our approach performs similarly to the state-of-the-art approaches. CONCLUSION: In this paper, a new algorithm is proposed to generate synthetic instances of the minority class using the Covariance Matrix.
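
The abstract does not say how the covariance matrix is turned into new instances, so the following is only a rough sketch under one stated assumption: synthetic minority rows are drawn from a multivariate normal distribution fitted to the minority class. This is not the authors' algorithm. The function name `oversample_with_covariance` and the `clip_to_range` flag are hypothetical; clipping to each attribute's observed minimum and maximum loosely mirrors the SMOTE-CovI idea, and leaving values unclipped mirrors SMOTE-CovO.

```python
import numpy as np


def oversample_with_covariance(X_min, n_new, clip_to_range=True, seed=0):
    """Hypothetical helper: draw n_new synthetic rows for the minority class.

    Assumption (not from the paper): samples come from a Gaussian whose
    mean and covariance are estimated from the minority rows X_min.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Estimate the mean vector and covariance matrix of the minority class.
    mean = X_min.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_min, rowvar=False))

    # Draw synthetic instances that follow the estimated covariance structure.
    synthetic = rng.multivariate_normal(mean, cov, size=n_new)

    if clip_to_range:
        # "Inside the interval" variant: keep every attribute within the
        # minimum and maximum observed in the minority class.
        synthetic = np.clip(synthetic, X_min.min(axis=0), X_min.max(axis=0))
    return synthetic


# Toy minority class with two continuous attributes.
X_minority = np.array(
    [[1.0, 2.0], [2.0, 3.5], [3.0, 3.0], [4.0, 5.0], [5.0, 4.5]]
)
inside = oversample_with_covariance(X_minority, n_new=20)
outside = oversample_with_covariance(X_minority, n_new=20, clip_to_range=False)
```

In a binary problem, `n_new` would typically be set to the difference between the majority and minority class sizes so that the two classes end up balanced.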

Cite

Leguen-de Varona, I., Madera, J., Martínez-López, Y., & Hernández-Nieto, J. C. (2020). Over-sampling imbalanced datasets using the covariance matrix. EAI Endorsed Transactions on Energy Web, 7(27). https://doi.org/10.4108/eai.13-7-2018.163982
