Abstract
Credit scoring analysis has gained tremendous importance for researchers and the financial industry around the globe. It helps financial institutions grant credit or loans to each deserving applicant with zero or minimal risk. However, developing an accurate and effective credit scoring model is a challenging task due to class imbalance and the presence of irrelevant features. Recent research shows that ensemble learning has achieved the strongest results in this field. In this paper, we perform an extensive comparative analysis of ensemble algorithms. To improve the algorithms further, oversampling and feature selection (FS) techniques are applied. The relevant features are identified using three FS techniques: information gain (IG), principal component analysis (PCA), and genetic algorithm (GA). Additionally, a comparative performance analysis is performed using 5 base and 14 ensemble models on three credit scoring datasets. The experimental results show that the GA-based FS technique and the CatBoost algorithm perform significantly better than the other models in terms of five metrics: accuracy (ACC), area under the curve (AUC), F1-score, Brier score (BS), and the Kolmogorov-Smirnov (KS) statistic.
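For readers less familiar with the five evaluation metrics named above, the following is a minimal pure-Python sketch of how each can be computed from true labels and predicted probabilities. It is an illustration only, not the authors' evaluation code. The function names, the toy data, the 0.5 decision threshold, and the convention that label 1 marks the positive class are all assumptions introduced here.

```python
# Illustrative sketch only: names, threshold, and toy data are assumptions,
# not taken from the paper. Label 1 is treated as the positive class.

def _predict(y_prob, threshold=0.5):
    """Turn probabilities into hard 0/1 predictions."""
    return [1 if p >= threshold else 0 for p in y_prob]

def accuracy(y_true, y_prob, threshold=0.5):
    """ACC: fraction of correct hard predictions."""
    preds = _predict(y_prob, threshold)
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)

def f1_score(y_true, y_prob, threshold=0.5):
    """F1: harmonic mean of precision and recall for the positive class."""
    preds = _predict(y_prob, threshold)
    tp = sum(1 for p, t in zip(preds, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, y_true) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def brier_score(y_true, y_prob):
    """BS: mean squared error of the probabilities (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(y_prob, y_true)) / len(y_true)

def auc(y_true, y_prob):
    """AUC: probability that a random positive outranks a random negative."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))

def ks_statistic(y_true, y_prob):
    """KS: largest gap between the score CDFs of the two classes."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    best = 0.0
    for cut in sorted(set(y_prob)):
        cdf_pos = sum(1 for p in pos if p <= cut) / len(pos)
        cdf_neg = sum(1 for p in neg if p <= cut) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Toy data, invented for illustration.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

for name, fn in [("ACC", accuracy), ("AUC", auc), ("F1", f1_score),
                 ("BS", brier_score), ("KS", ks_statistic)]:
    print(f"{name}: {fn(y_true, y_prob):.4f}")
```

ACC and F1 depend on the chosen threshold, whereas AUC, BS, and KS are computed directly from the probabilities. This is one reason imbalanced-data studies tend to report several of these metrics together.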
Cite
Lenka, S. R., Bisoy, S. K., Priyadarshini, R., & Sain, M. (2022). Empirical Analysis of Ensemble Learning for Imbalanced Credit Scoring Datasets: A Systematic Review. Wireless Communications and Mobile Computing. Hindawi Limited. https://doi.org/10.1155/2022/6584352