Empirical Analysis of Ensemble Learning for Imbalanced Credit Scoring Datasets: A Systematic Review

Abstract

Credit scoring has gained tremendous importance for researchers and the financial industry around the globe. It helps financial institutions grant credit or loans to deserving applicants with zero or minimal risk. However, developing an accurate and effective credit scoring model is challenging because of class imbalance and the presence of irrelevant features. Recent research shows that ensemble learning has achieved superior performance in this field. In this paper, we perform an extensive comparative analysis of ensemble algorithms; to improve the models further, oversampling and feature selection (FS) techniques are applied. The relevant features are identified using three FS techniques: information gain (IG), principal component analysis (PCA), and genetic algorithm (GA). Additionally, a comparative performance analysis is performed using 5 base and 14 ensemble models on three credit scoring datasets. The experimental results show that the GA-based FS technique and the CatBoost algorithm perform significantly better than the other models in terms of five metrics: accuracy (ACC), area under the curve (AUC), F1-score, Brier score (BS), and the Kolmogorov-Smirnov (KS) statistic.
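For readers unfamiliar with the five evaluation metrics named in the abstract, the following is a minimal sketch of how each can be computed for a binary credit scoring task. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the toy values are not results from the paper.

```python
# Illustrative, dependency-free definitions of the five metrics in the abstract.
# All names and the toy data below are hypothetical, not taken from the paper.

def accuracy(y_true, y_pred):
    """Fraction of applicants whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def _split_scores(y_true, y_prob):
    """Separate scores of the positive class from scores of the negative class."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    return pos, neg

def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative
    (ties count as one half). Assumes both classes are present."""
    pos, neg = _split_scores(y_true, y_prob)
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def ks_statistic(y_true, y_prob):
    """Maximum gap between the empirical score distributions of the two classes.
    Assumes both classes are present."""
    pos, neg = _split_scores(y_true, y_prob)
    gaps = []
    for cut in sorted(set(y_prob)):
        cdf_pos = sum(p <= cut for p in pos) / len(pos)
        cdf_neg = sum(q <= cut for q in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)

# Toy imbalanced example: 4 negatives, 2 positives (made-up scores).
y_true = [0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

for name, value in [
    ("ACC", accuracy(y_true, y_pred)),
    ("AUC", auc(y_true, y_prob)),
    ("F1", f1_score(y_true, y_pred)),
    ("BS", brier_score(y_true, y_prob)),
    ("KS", ks_statistic(y_true, y_prob)),
]:
    print(f"{name}: {value:.3f}")
```

ACC and F1 depend on the chosen threshold, whereas AUC, BS, and KS are computed directly from the predicted scores. On imbalanced data, accuracy alone can look favorable while the minority class is poorly classified, so the threshold-free metrics add useful information.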

Cite

Lenka, S. R., Bisoy, S. K., Priyadarshini, R., & Sain, M. (2022). Empirical Analysis of Ensemble Learning for Imbalanced Credit Scoring Datasets: A Systematic Review. Wireless Communications and Mobile Computing. Hindawi Limited. https://doi.org/10.1155/2022/6584352
