Overlap-Based Undersampling for Improving Imbalanced Data Classification

Pattaramon Vuttipittayamongkol; Eyad Elyan; Andrei Petrovski; Chrisina Jayne

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11314 LNCS 689-697

DOI: 10.1007/978-3-030-03493-1_72

45 Citations

23 Readers

Abstract

Classification of imbalanced data remains an important field in machine learning. Several methods have been proposed to address the class imbalance problem including data resampling, adaptive learning and cost adjusting algorithms. Data resampling methods are widely used due to their simplicity and flexibility. Most existing resampling techniques aim at rebalancing class distribution. However, class imbalance is not the only factor that impacts the performance of the learning algorithm. Class overlap has proved to have a higher impact on the classification of imbalanced datasets than the dominance of the negative class. In this paper, we propose a new undersampling method that eliminates negative instances from the overlapping region and hence improves the visibility of the minority instances. Testing and evaluating the proposed method using 36 public imbalanced datasets showed statistically significant improvements in classification performance.
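The abstract does not say how the overlapping region is identified, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a simple k-nearest-neighbour rule: a negative (majority) instance is treated as lying in the overlapping region if any of its k nearest neighbours is a positive (minority) instance. The function name `overlap_undersample` and the parameters `k` and `positive_label` are hypothetical and chosen for illustration.

```python
import numpy as np


def overlap_undersample(X, y, k=3, positive_label=1):
    """Drop negative instances that have a positive instance among their
    k nearest neighbours (an assumed definition of "overlap").

    Illustrative sketch only; assumes k is smaller than the number of samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances between all instances.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbour

    # Indices of the k nearest neighbours of every instance.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    is_positive = y == positive_label
    near_positive = is_positive[neighbours].any(axis=1)

    # Keep every positive instance; keep a negative only if it is not near a positive.
    keep = is_positive | ~near_positive
    return X[keep], y[keep]


# Toy data: four negatives far from the minority class, one negative inside it.
X_demo = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2]])
y_demo = np.array([0, 0, 0, 0, 0, 1, 1])
X_res, y_res = overlap_undersample(X_demo, y_demo, k=2)
```

On the toy data, only the negative instance at 5.0 sits next to the positive instances, so it is the only one removed. Unlike rebalancing methods, this rule makes no attempt to equalise class sizes; it only clears negatives from around the minority class.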

Cite

Vuttipittayamongkol, P., Elyan, E., Petrovski, A., & Jayne, C. (2018). Overlap-Based Undersampling for Improving Imbalanced Data Classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11314 LNCS, pp. 689–697). Springer Verlag. https://doi.org/10.1007/978-3-030-03493-1_72
