Financial institutions need to assess the creditworthiness of borrowers who apply for loans. Data scientists can contribute valuable insights that explain customer profile and behavior. This paper analyzes a database of customers, some of whom were unable to repay their loans and entered default status. Using data mining methodology and machine learning algorithms, a series of predictive models was developed with classifiers such as LightGBM, XGBoost, Logistic Regression and Random Forest in order to estimate the probability that a customer defaults on a loan. Three sampling scenarios were created to compare classification on imbalanced and balanced data sets. A model comparison analysis was then performed to identify the best classifier according to four performance metrics: AUC score, Precision, Recall and Accuracy. The best results were observed for the optimal Random Forest classifier applied to the combined under-over sampling scenario, with an AUC of 0.89.
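The abstract does not say how the combined under-over sampling scenario was constructed. As a rough illustration only, the sketch below balances a two-class data set by randomly undersampling the majority class and randomly oversampling the minority class with replacement. The function name `under_over_sample`, the `target_per_class` parameter and the toy data are all assumptions made for this example, not the authors' procedure.

```python
import random
from collections import Counter


def under_over_sample(rows, labels, target_per_class, seed=0):
    """Hypothetical helper: bring every class to target_per_class rows.

    Classes larger than the target are randomly undersampled (without
    replacement); smaller classes keep all their rows and are topped up
    by random oversampling (with replacement).
    """
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)
        else:
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Toy, made-up data: 90 non-default customers (0) and 10 defaults (1).
rows = [(i,) for i in range(100)]
labels = [0] * 90 + [1] * 10

bal_rows, bal_labels = under_over_sample(rows, labels, target_per_class=50)
counts = Counter(bal_labels)
print(counts[0], counts[1])
```

In practice, resampling of this kind would normally be applied to the training split only, so that metrics such as AUC are measured on data with the original class proportions.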
CITATION STYLE
Coşer, A., Maer-Matei, M. M., & Albu, C. (2019). Predictive models for loan default risk assessment. Economic Computation and Economic Cybernetics Studies and Research, 53(2), 149–165. https://doi.org/10.24818/18423264/53.2.19.09