Consumer credit risk analysis is important for financial institutions because it allows them to extend credit only to consumers with good credit. Besides traditional methods such as credit scoring, machine learning can also be used to classify credit applications. Many research papers have tested machine learning methods and concluded that they can distinguish good credit from bad credit. Those papers found that ensemble methods, Support Vector Machine (SVM), and MLP (neural network) showed the best accuracy in credit risk classification. However, each of them compared only a few algorithms, so it is difficult to determine which algorithm performs best. Therefore, this paper compares the algorithms from the previous papers by applying them to the German credit dataset, as it is the most common dataset used for credit risk analysis. The algorithms compared are Logistic Regression, Linear Discriminant Analysis, Support Vector Machine, Naïve Bayes, K-Nearest Neighbors, Decision Tree, ensemble methods (random forest, bagging, boosting), and MLP. Performance is measured using accuracy, average accuracy from 10-fold cross-validation, AUC, sensitivity, specificity, precision, and F1 score. The paper focuses on finding the most accurate algorithm for credit risk analysis. The results showed that Logistic Regression, Support Vector Machine, and Bagging performed well. However, because the scores were dispersed, it was difficult to conclude which algorithm had the best accuracy. In the end, SVM was chosen as it was deemed the most accurate algorithm for credit risk analysis.
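As an illustration of the threshold-based metrics listed in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, precision, and F1 can be derived from a confusion matrix. It is not the paper's code: the helper names, the label encoding (1 = positive class, 0 = negative class), and the toy labels are all assumptions made for this example, and AUC and 10-fold cross-validation are not covered.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem.

    The `positive` label is an assumption of this sketch; which class
    counts as positive is a modelling choice.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a metric is undefined.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Derive the threshold-based metrics from confusion-matrix counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels, invented purely for illustration (not from the German dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(confusion_counts(y_true, y_pred))
```

Because each metric weights the four cells differently, two classifiers can rank differently depending on which metric is used, which is one way scores across algorithms can end up dispersed.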
CITATION STYLE
Seo, J. Y. (2020). Machine Learning in Consumer Credit Risk Analysis: A Review. International Journal of Advanced Trends in Computer Science and Engineering, 9(4), 6440–6445. https://doi.org/10.30534/ijatcse/2020/328942020