An empirical study of bagging predictors for imbalanced data with different levels of class distribution

Guohua Liang; Xingquan Zhu; Chengqi Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 7106 LNAI 213-222

DOI: 10.1007/978-3-642-25832-9_22

Abstract

Research into learning from imbalanced data has increasingly captured the attention of both academia and industry, especially when the class distribution is highly skewed. This paper compares the Area Under the Receiver Operating Characteristic Curve (AUC) performance of bagging in the context of learning from different imbalanced levels of class distribution. Despite the popularity of bagging in many real-world applications, some questions have not been clearly answered in the existing research, e.g., which bagging predictors may achieve the best performance for applications, and whether bagging is superior to single learners when the levels of class distribution change. We perform a comprehensive evaluation of the AUC performance of bagging predictors with 12 base learners at different imbalanced levels of class distribution by using a sampling technique on 14 imbalanced data-sets. Our experimental results indicate that Decision Table (DTable) and RepTree are the learning algorithms with the best bagging AUC performance. Most AUC performances of bagging predictors are statistically superior to single learners, except for Support Vector Machines (SVM) and Decision Stump (DStump). © 2011 Springer-Verlag.
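The evaluation protocol described in the abstract (resample a data set to several class distributions, then compare the AUC of a single learner against its bagged counterpart) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' experimental setup: the synthetic one-feature data, the hand-written decision stump, the undersampling rule, and the names `resample_to_ratio`, `fit_stump`, `fit_bagging`, and `run_experiment` are all hypothetical, and the paper itself used 12 base learners on 14 real data sets.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def resample_to_ratio(data, minority_fraction, rng):
    """Assumed sampling rule: keep all negatives and undersample positives
    (label 1) so they form roughly `minority_fraction` of the result."""
    pos = [d for d in data if d[1] == 1]
    neg = [d for d in data if d[1] == 0]
    n_pos = max(2, round(minority_fraction * len(neg) / (1 - minority_fraction)))
    return neg + rng.sample(pos, min(n_pos, len(pos)))

def fit_stump(data):
    """One-feature decision stump: choose the threshold with the lowest
    training error and predict 1.0 when x exceeds it."""
    thresholds = sorted({x for x, _ in data})
    best_t, best_err = thresholds[0], float("inf")
    for t in thresholds:
        err = sum((x > t) != (y == 1) for x, y in data)
        if err < best_err:
            best_t, best_err = t, err
    return lambda x: 1.0 if x > best_t else 0.0

def fit_bagging(data, n_estimators, rng):
    """Bagging: fit one stump per bootstrap replicate and average their votes."""
    models = [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_estimators)]
    return lambda x: sum(m(x) for m in models) / len(models)

def make_data(n_neg, n_pos, rng):
    """Synthetic data (an assumption): negatives ~ N(0, 1), positives ~ N(1, 1)."""
    return ([(rng.gauss(0.0, 1.0), 0) for _ in range(n_neg)] +
            [(rng.gauss(1.0, 1.0), 1) for _ in range(n_pos)])

def run_experiment(seed=0, fractions=(0.5, 0.2, 0.05), n_estimators=15):
    """Return {minority_fraction: (single_auc, bagged_auc)} on a held-out set."""
    rng = random.Random(seed)
    train = make_data(200, 200, rng)
    test = make_data(200, 200, rng)
    xs, ys = [x for x, _ in test], [y for _, y in test]
    results = {}
    for frac in fractions:
        sample = resample_to_ratio(train, frac, rng)
        single = fit_stump(sample)
        bagged = fit_bagging(sample, n_estimators, rng)
        results[frac] = (auc([single(x) for x in xs], ys),
                         auc([bagged(x) for x in xs], ys))
    return results

if __name__ == "__main__":
    for frac, (single_auc, bagged_auc) in run_experiment().items():
        print(f"minority={frac:.2f}  single AUC={single_auc:.3f}  bagged AUC={bagged_auc:.3f}")
```

A single stump outputs only 0 or 1, so its ROC curve has one operating point, whereas the bagged ensemble produces graded scores from averaged votes. Any numbers this toy prints say nothing about the paper's findings, which rest on statistical tests across many data sets and report no significant bagging gain for Decision Stump or SVM.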

Citation (APA)

Liang, G., Zhu, X., & Zhang, C. (2011). An empirical study of bagging predictors for imbalanced data with different levels of class distribution. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7106 LNAI, pp. 213–222). https://doi.org/10.1007/978-3-642-25832-9_22
