Abstract
Imbalanced classification is a challenging problem in the field of big data research and applications. Complex data distributions, such as small disjuncts and overlapping classes, make it difficult for traditional methods to recognize the minority class, leading to low sensitivity. The misclassification costs of the minority class are usually higher than those of the majority class. To deal with imbalanced datasets, typical algorithmic-level methods either introduce cost information or simply rebalance the class distribution without considering the distribution of the minority class. In this paper, we propose an optimization embedded bagging (OEBag) approach that increases sensitivity by learning the complex distributions in the minority class more precisely. When training its base classifiers, OEBag selectively focuses on minority examples that are easily misclassified, as identified from the out-of-bag examples. OEBag is implemented with two specialized under-sampling bagging methods. Nineteen real datasets with diverse levels of classification difficulty are used in the experiments. Experimental results demonstrate that OEBag performs significantly better in sensitivity and achieves strong overall performance in terms of AUC (area under the ROC curve) and G-mean when compared with several state-of-the-art methods.
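The abstract describes under-sampling bagging guided by out-of-bag (OOB) errors on the minority class. The following is a minimal sketch of that general idea, not the authors' OEBag algorithm: it assumes binary 0/1 labels with 1 as the minority class, and the function name oob_guided_underbagging and parameters n_estimators and boost_weight are hypothetical choices for illustration.

```python
# Illustrative sketch of OOB-guided under-sampling bagging (not the authors' OEBag).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def oob_guided_underbagging(X, y, n_estimators=20, boost_weight=2.0, seed=0):
    """Train decision trees on balanced bags; minority examples that the
    ensemble misclassifies out-of-bag are sampled more often in later bags."""
    rng = np.random.default_rng(seed)
    min_idx = np.flatnonzero(y == 1)          # assume label 1 = minority class
    maj_idx = np.flatnonzero(y == 0)
    weights = np.ones(len(min_idx))           # sampling weights for minority examples
    oob_votes = np.zeros((len(X), 2))         # accumulated OOB votes per example
    trees = []
    for _ in range(n_estimators):
        # Under-sample the majority class to the minority size, and draw
        # minority examples according to their current weights.
        maj_s = rng.choice(maj_idx, size=len(min_idx), replace=False)
        p = weights / weights.sum()
        min_s = rng.choice(min_idx, size=len(min_idx), replace=True, p=p)
        bag = np.concatenate([maj_s, min_s])
        tree = DecisionTreeClassifier(random_state=int(rng.integers(1 << 30)))
        tree.fit(X[bag], y[bag])
        trees.append(tree)
        # Record OOB votes, then up-weight minority examples the ensemble
        # still misclassifies out-of-bag (the "hard" minority examples).
        oob = np.setdiff1d(np.arange(len(X)), bag)
        oob_votes[oob, tree.predict(X[oob]).astype(int)] += 1
        oob_pred = oob_votes.argmax(axis=1)
        seen = oob_votes[min_idx].sum(axis=1) > 0
        hard = seen & (oob_pred[min_idx] == 0)
        weights[hard] *= boost_weight
    return trees

def predict(trees, X):
    """Majority vote of the ensemble."""
    votes = np.mean([t.predict(X) for t in trees], axis=0)
    return (votes >= 0.5).astype(int)
```

In this sketch the multiplicative re-weighting stands in for the optimization step the paper embeds into bagging; the core idea shown is only that out-of-bag predictions identify the easily misclassified minority examples.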
Citation
Guan, H., Zhang, Y., Cheng, H., & Tang, X. (2018). A novel imbalanced classification method based on decision tree and bagging. International Journal of Performability Engineering, 14(6), 1140–1148. https://doi.org/10.23940/ijpe.18.06.p5.11401148