An approach to imbalanced data classification based on instance selection and over-sampling

Ireneusz Czarnowski; Piotr Jędrzejowicz

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11683 LNAI 601-610

DOI: 10.1007/978-3-030-28377-3_50

Abstract

The paper addresses the problem of learning from class-imbalanced data. The class imbalance problem arises when the number of instances from different classes differs substantially. Instance selection aims at deciding which instances from the training set should be retained and used during the learning process. Over-sampling is an approach that duplicates minority class instances. In the paper, a hybrid approach to learning from imbalanced data, combining over-sampling and instance selection, is proposed. Instances are selected to reduce the number of instances belonging to the majority class, while the number of instances belonging to the minority class is expanded. Instance selection is based on clustering, using the authors' agent-based population learning algorithm for clustering and instance selection. As a result, a more balanced distribution of instances belonging to different classes is obtained and the dataset size is reduced. The proposed approach is validated experimentally using several benchmark datasets.
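The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not detail the agent-based population learning algorithm, so plain k-means stands in for the clustering step, and the names `select_by_clustering` and `rebalance`, the two-class labelling (0 for majority, 1 for minority), and the synthetic data are all assumptions made for illustration.

```python
import random
from collections import Counter


def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_by_clustering(points, k, rng, iters=20):
    """Return at most k representatives: the point closest to each centroid.

    Plain k-means is an assumed stand-in for the paper's agent-based clustering.
    """
    k = min(k, len(points))
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: _dist2(p, centroids[c]))
            clusters[j].append(p)
        # Recompute each centroid as its cluster mean; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    # Two centroids may share a nearest point, so duplicates are dropped.
    reps = {min(points, key=lambda p: _dist2(p, c)) for c in centroids}
    return sorted(reps)


def rebalance(majority, minority, k, seed=0):
    """Shrink the majority class by instance selection, grow the minority by duplication."""
    rng = random.Random(seed)
    kept = select_by_clustering(majority, k, rng)
    # Over-sampling: duplicate random minority instances until the classes match in size.
    shortfall = max(0, len(kept) - len(minority))
    extra = [rng.choice(minority) for _ in range(shortfall)]
    X = kept + minority + extra
    y = [0] * len(kept) + [1] * (len(minority) + len(extra))
    return X, y


# Synthetic two-class data (hypothetical): 40 majority points, 5 minority points.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
minority = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(5)]

X_bal, y_bal = rebalance(majority, minority, k=8)
print(Counter(y_bal))
```

In this sketch the majority class shrinks to at most `k` representatives while the minority class is duplicated up to the same count, so the resulting set is both smaller and more balanced than the input.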

Citation (APA)

Czarnowski, I., & Jędrzejowicz, P. (2019). An approach to imbalanced data classification based on instance selection and over-sampling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11683 LNAI, pp. 601–610). Springer Verlag. https://doi.org/10.1007/978-3-030-28377-3_50
