Imbalanced data classification: A novel re-sampling approach combining versatile improved SMOTE and rough sets

Abstract

In recent years, learning from imbalanced data has emerged as an important and challenging problem. The underrepresentation of one class in the data set is not the only source of difficulty. A complex data distribution, especially small disjuncts, noise and class overlapping, also contributes to a significant degradation of classifier performance. Numerous solutions have therefore been proposed. They fall into three groups: data-level techniques, algorithm-level methods and cost-sensitive approaches. This paper presents a novel data-level method combining Versatile Improved SMOTE and rough sets. The algorithm was applied to two-class problems in which the data sets were characterized by nominal attributes. We evaluated the proposed technique against other preprocessing methods and specifically verified the impact of the additional cleaning phase.
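
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a simplified SMOTE-style oversampler for nominal attributes (Hamming distance and a per-attribute majority vote are assumptions made here) and a textbook rough-set lower approximation used as a cleaning filter. The function names `oversample_nominal` and `lower_approximation` are hypothetical, and the "versatile" mode selection of Versatile Improved SMOTE is not modeled.

```python
import random
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of attribute positions where two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def oversample_nominal(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority records (illustrative assumption).

    Each synthetic record takes, per attribute, the most common value among
    a randomly chosen seed record and its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: hamming(base, m))[:k]
        group = [base] + neighbours
        # Majority vote per attribute; ties go to the value seen first.
        record = tuple(Counter(col).most_common(1)[0][0] for col in zip(*group))
        synthetic.append(record)
    return synthetic


def lower_approximation(records, labels, target):
    """Indices of records whose indiscernibility class is purely `target`.

    Records with identical attribute values are indiscernible; a class of
    such records belongs to the lower approximation only if every member
    carries the target label.
    """
    classes = defaultdict(list)
    for i, rec in enumerate(records):
        classes[rec].append(i)
    keep = []
    for members in classes.values():
        if all(labels[i] == target for i in members):
            keep.extend(members)
    return sorted(keep)
```

One possible use is to generate synthetic minority records, append them to the training set, and then keep only those minority records that fall in the lower approximation of the minority class, so that examples indiscernible from majority records are dropped. Whether the paper's cleaning phase works exactly this way is not stated in the abstract.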

Cite

Borowska, K., & Stepaniuk, J. (2016). Imbalanced data classification: A novel re-sampling approach combining versatile improved SMOTE and rough sets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9842 LNCS, pp. 31–42). Springer Verlag. https://doi.org/10.1007/978-3-319-45378-1_4
