Instance hardness and multivariate Gaussian distribution-based oversampling technique for imbalance classification

Abstract

Imbalanced classification has received considerable attention because of its many real-world applications. Data-level approaches are the most convenient way to address data imbalance, and oversampling is the most thoroughly explored among them. However, most previous studies select minority class instances for oversampling using distance-based factors, so the synthetic instances often do not follow the distribution of the original minority class instances. In this work, we propose a novel oversampling method based on instance hardness and the multivariate Gaussian distribution. First, a fused feature set comprising the k-disagree value and the classification error is used to select and weight minority class instances for oversampling; the k-disagree value is also used to filter majority class instances. Then, a multivariate Gaussian distribution is fitted to a subset of the selected minority class instances, where the subset is chosen by closest-based and cluster-based methods. Next, new instances are generated from the fitted subset distribution. Finally, Euclidean distance-based instance selection is investigated to further improve classification performance on imbalanced data. Experimental results on datasets from the KEEL repository show that the proposed method can outperform the compared oversamplers in terms of both AUC and G-mean.
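
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact fusion of k-disagree value and classification error, so the sketch below uses only the k-disagree value (the fraction of a point's k nearest neighbours that carry a different label) as the hardness score. The function names, the `+ 0.1` weight offset, the `ridge` term, and the rule that drops minority points with hardness 1.0 are all assumptions made for illustration. Only the "closest" subset variant is shown, and the code relies solely on the Python standard library.

```python
import math
import random


def k_disagree(X, y, k=5):
    """Fraction of each point's k nearest neighbours that have a different label."""
    scores = []
    for i, xi in enumerate(X):
        # Other points ordered by Euclidean distance to xi.
        order = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: math.dist(xi, X[j]))
        nearest = order[:k]
        scores.append(sum(y[j] != y[i] for j in nearest) / len(nearest))
    return scores


def fit_gaussian(points, ridge=1e-6):
    """Sample mean and covariance; a small ridge keeps the covariance positive definite."""
    n, d = len(points), len(points[0])
    mean = [sum(p[a] for p in points) / n for a in range(d)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / max(n - 1, 1)
            for b in range(d)] for a in range(d)]
    for a in range(d):
        cov[a][a] += ridge  # assumption: regularisation chosen for this sketch
    return mean, cov


def cholesky(A):
    """Lower-triangular L with L * L^T = A, for a positive definite matrix A."""
    d = len(A)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_gaussian(mean, cov, rng):
    """Draw one sample as mean + L z, where z is standard normal."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [mean[i] + sum(L[i][j] * z[j] for j in range(i + 1))
            for i in range(len(mean))]


def oversample(X, y, minority=1, n_new=10, k=5, seed=0):
    """Generate n_new synthetic minority points from locally fitted Gaussians."""
    rng = random.Random(seed)
    hardness = k_disagree(X, y, k)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    # Assumption: minority points whose neighbours are all majority are treated
    # as noise and skipped, unless that would leave no candidates at all.
    candidates = [i for i in minority_idx if hardness[i] < 1.0] or minority_idx
    # Assumption: harder points get larger weights; the offset keeps weights > 0.
    weights = [hardness[i] + 0.1 for i in candidates]
    minority_pts = [X[i] for i in minority_idx]
    synthetic = []
    for _ in range(n_new):
        seed_idx = rng.choices(candidates, weights=weights, k=1)[0]
        # "Closest" subset: the seed point plus its k nearest minority neighbours.
        subset = sorted(minority_pts,
                        key=lambda p: math.dist(X[seed_idx], p))[:k + 1]
        mean, cov = fit_gaussian(subset)
        synthetic.append(sample_gaussian(mean, cov, rng))
    return synthetic
```

Calling `oversample(X, y, minority=1, n_new=20)` on a list of feature vectors `X` and labels `y` returns 20 synthetic minority points. The majority-class filtering and the final Euclidean distance-based instance selection mentioned in the abstract are left out of this sketch.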

Cite

Xie, J., Zhu, M., Hu, K., & Zhang, J. (2023). Instance hardness and multivariate Gaussian distribution-based oversampling technique for imbalance classification. Pattern Analysis and Applications, 26(2), 735–749. https://doi.org/10.1007/s10044-022-01129-5
