SMOTE-Cov: A new oversampling method based on the covariance matrix


Abstract

Many machine learning tasks involve learning from imbalanced datasets, which leads to misclassification of the minority class. One of the state-of-the-art approaches to “solve” this problem at the data level is the Synthetic Minority Oversampling Technique (SMOTE), which uses k-nearest neighbors (KNN) to select and generate new instances. However, such approaches do not take into account the dependency relationships among attributes. This chapter presents SMOTE-Cov, a modified SMOTE that uses the covariance matrix instead of KNN to balance datasets with continuous attributes and a binary class. We implemented two variants: SMOTE-CovI, which generates new values within the interval of each attribute, and SMOTE-CovO, which allows some values to fall outside the interval of the attributes. SMOTE-Cov was validated in an experimental study using C4.5 as the classifier. The results show that our approach performs similarly to the state-of-the-art approaches: after applying the Friedman and Holm statistical tests, we found no statistically significant difference.
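The abstract does not spell out how the covariance matrix is used to generate instances, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes new minority instances are drawn from a multivariate normal distribution fitted to the minority-class mean and covariance. Clipping to each attribute's observed range loosely mimics the SMOTE-CovI idea, and leaving samples unclipped loosely mimics SMOTE-CovO. The function name `cov_oversample` and its parameters are hypothetical.

```python
import numpy as np


def cov_oversample(X_min, n_new, clip_to_range=True, seed=0):
    """Draw synthetic minority samples from a Gaussian fitted to X_min.

    ASSUMPTION: the Gaussian sampling step is an illustration only; the
    abstract does not describe the actual SMOTE-Cov generation procedure.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # The mean vector and covariance matrix capture attribute dependencies.
    mean = X_min.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_min, rowvar=False))

    samples = rng.multivariate_normal(mean, cov, size=n_new)

    if clip_to_range:
        # "Inside the interval" variant: keep each attribute within the
        # minimum and maximum observed in the minority class.
        samples = np.clip(samples, X_min.min(axis=0), X_min.max(axis=0))
    return samples


# Toy minority class: 20 instances with 3 continuous attributes.
X_minority = np.random.default_rng(1).normal(size=(20, 3))
inside = cov_oversample(X_minority, n_new=30, clip_to_range=True)
outside = cov_oversample(X_minority, n_new=30, clip_to_range=False)
```

In this sketch the only difference between the two calls is the clipping step, so `outside` may contain attribute values beyond those observed in the original minority instances, while `inside` cannot.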

Cite

Leguen-deVarona, I., Madera, J., Martínez-López, Y., & Hernández-Nieto, J. C. (2020). Smote-cov: A new oversampling method based on the covariance matrix. In EAI/Springer Innovations in Communication and Computing (pp. 207–215). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-48149-0_15
