Introduction to Imbalanced Data

Osamu Komori; Shinto Eguchi

Book Chapter

DOI: 10.1007/978-4-431-55570-4_1

Abstract

An imbalance of sample sizes among class labels makes it difficult to obtain high classification accuracy in many scientific fields, including medical diagnosis, bioinformatics, biology, and fisheries management. This difficulty is referred to as the "class imbalance problem" and is considered to be among the 10 most important problems in data mining research. The topic has also been widely discussed in several machine learning workshops. The critical feature of the imbalance problem is that it significantly degrades the performance of standard classification methods, which implicitly assume balanced class distributions and equal misclassification costs for each class. Hence, new strategies are required to mitigate such imbalances, including resampling techniques, modification of the classification algorithms, and adjustment of weights for class distributions.
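As a rough illustration of the ideas in the abstract (not the authors' own method), the following Python sketch shows why raw accuracy can mislead on imbalanced labels and how two of the generic strategies mentioned, class weighting and resampling, might look in their simplest form. The 95/5 split, the function names, and the inverse-frequency weighting formula are assumptions chosen for the example; the chapter itself may use different schemes.

```python
import random
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a trivial classifier that always predicts the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def inverse_frequency_weights(labels):
    """Per-class weights n / (k * n_c), so every class carries the same total weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Hypothetical data: 95 negatives (label 0) and 5 positives (label 1).
labels = [0] * 95 + [1] * 5
samples = list(range(100))  # stand-in feature values, one per sample

# Always predicting "0" already scores 0.95 while missing every positive.
print(majority_baseline_accuracy(labels))

# The minority class receives a much larger weight than the majority class.
weights = inverse_frequency_weights(labels)
print(weights)

# After oversampling, both classes have 95 samples.
bal_x, bal_y = random_oversample(samples, labels)
print(Counter(bal_y))
```

The baseline figure is the point of the example: a classifier can reach 95% accuracy here without ever detecting the minority class, which is why weighting or resampling is considered in the first place. Plain random oversampling only duplicates existing minority samples, so it is best read as the simplest member of the resampling family rather than a recommended default.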

Cite

Komori, O., & Eguchi, S. (2019). Introduction to Imbalanced Data (pp. 1–10). https://doi.org/10.1007/978-4-431-55570-4_1
