On the Role of Cost-Sensitive Learning in Imbalanced Data Oversampling

This article is free to access.

Abstract

Learning from imbalanced data is still considered one of the most challenging areas of machine learning. Among the plethora of methods dedicated to alleviating the challenge of skewed distributions, the two most distinct are data-level sampling and cost-sensitive learning. The former modifies the training set by either removing majority instances or generating additional minority ones. The latter associates a penalty cost with the minority class in order to mitigate the classifiers’ bias towards the better-represented class. While these two approaches have been extensively studied on their own, no works so far have tried to combine their properties. Such a direction seems highly promising: in many real-life imbalanced problems the actual misclassification cost can be obtained, and it should therefore be embedded in the classification framework regardless of the selected algorithm. This work aims to open a new direction for learning from imbalanced data by investigating the interplay between the oversampling and cost-sensitive approaches. We show that there is a direct relationship between the misclassification cost imposed on the minority class and the oversampling ratios that aim to balance both classes. This becomes evident when popular skew-insensitive metrics are modified to incorporate the cost-sensitive element. Our experimental study shows a strong relationship between sampling and cost, indicating that this direction should be pursued in order to develop new and effective algorithms for imbalanced data.
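
One simple way to see why a link between minority-class cost and oversampling ratio is plausible is the following toy sketch. It is an illustration under stated assumptions, not the authors' method or experimental setup: it assumes oversampling by plain duplication of minority instances and a training objective that is a sum of per-instance losses (binary log loss here). The data, the `ratio` value, and the helper names `log_loss` and `total_loss` are all hypothetical. Under those assumptions, duplicating every minority instance `ratio` times yields the same total loss as keeping the original data and weighting each minority instance by a cost of `ratio`.

```python
import math


def log_loss(y_true, p_pred):
    """Per-instance binary cross-entropy; p_pred is the predicted P(y = 1)."""
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))


def total_loss(labels, probs, weights=None):
    """Sum of per-instance losses, optionally weighted by a per-instance cost."""
    if weights is None:
        weights = [1.0] * len(labels)
    return sum(w * log_loss(y, p) for y, p, w in zip(labels, probs, weights))


# Hypothetical toy data: label 1 is the minority class (2 of 8 instances).
labels = [0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 0.6, 0.3]  # some classifier's P(y = 1)

# Assumed oversampling ratio: each minority instance appears 3 times,
# which balances the classes at 6 vs 6.
ratio = 3

# Data-level view: duplicate minority instances.
over_labels, over_probs = [], []
for y, p in zip(labels, probs):
    copies = ratio if y == 1 else 1
    over_labels.extend([y] * copies)
    over_probs.extend([p] * copies)

# Cost-sensitive view: keep the data, weight minority errors by the same ratio.
cost_weights = [float(ratio) if y == 1 else 1.0 for y in labels]

loss_oversampled = total_loss(over_labels, over_probs)
loss_cost_sensitive = total_loss(labels, probs, cost_weights)

print(math.isclose(loss_oversampled, loss_cost_sensitive))
```

The equivalence shown here holds only for exact duplication; oversamplers that synthesize new minority points would change the instances themselves, so the correspondence with a cost value would be approximate at best.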

Krawczyk, B., & Wozniak, M. (2019). On the Role of Cost-Sensitive Learning in Imbalanced Data Oversampling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11538 LNCS, pp. 180–191). Springer Verlag. https://doi.org/10.1007/978-3-030-22744-9_14
