On the discriminative power of credit scoring systems trained on independent samples

Miguel Biron; Cristián Bravo

Conference Proceedings

Studies in Classification, Data Analysis, and Knowledge Organization (2014) 47 247-254

DOI: 10.1007/978-3-319-01595-8_27

Abstract

The aim of this work is to assess the importance of the independence assumption in behavioral scoring models built with logistic regression. We develop four sampling methods that control which of the observations associated with each client are included in the training set, avoiding a functional dependence between observations of the same client. We then calibrate logistic regressions with variable selection on the samples created by each method, plus one using all the data in the training set (the biased base method), and validate the models on an independent data set. We find that the regression built using all the observations shows the highest area under the ROC curve and Kolmogorov–Smirnov statistic, while the regression that uses the fewest observations shows the lowest performance and the highest variance of these indicators. Nevertheless, the fourth selection algorithm presented shows almost the same performance as the base method using just 14% of the dataset and 14 fewer variables. We conclude that violating the independence assumption does not strongly affect results and, furthermore, that trying to control for it by using less data can harm the performance of the calibrated models, although a better sampling method does lead to equivalent results with a far smaller dataset.
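The abstract does not spell out the four sampling methods, so the following is only a minimal sketch of the general idea, under stated assumptions. It keeps one randomly chosen observation per client (a hypothetical rule, not one of the authors' four methods) and computes the two indicators the abstract names, the area under the ROC curve and the Kolmogorov–Smirnov statistic, from scores and labels. The field names `client_id`, `month`, and `default`, and all function names, are illustrative assumptions.

```python
import random
from collections import defaultdict


def one_observation_per_client(rows, seed=0):
    """Keep a single randomly chosen observation for each client id.

    Hypothetical sampling rule for illustration only; the paper's own
    four methods are not described in the abstract.
    """
    rng = random.Random(seed)
    by_client = defaultdict(list)
    for row in rows:
        by_client[row["client_id"]].append(row)
    return [rng.choice(observations) for observations in by_client.values()]


def split_by_class(scores, labels):
    """Return the scores of the positive (1) and negative (0) class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative)
    pairs ranked correctly, with ties counted as one half."""
    pos, neg = split_by_class(scores, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = split_by_class(scores, labels)
    best = 0.0
    for threshold in sorted(set(scores)):
        cdf_pos = sum(s <= threshold for s in pos) / len(pos)
        cdf_neg = sum(s <= threshold for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy panel: several monthly observations per client (made-up data).
rows = [
    {"client_id": "a", "month": 1, "default": 0},
    {"client_id": "a", "month": 2, "default": 0},
    {"client_id": "b", "month": 1, "default": 1},
    {"client_id": "c", "month": 1, "default": 0},
    {"client_id": "c", "month": 2, "default": 1},
]
sample = one_observation_per_client(rows)

# Toy scores and labels, standing in for a fitted model's output.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(scores, labels)
toy_ks = ks_statistic(scores, labels)
```

In a comparison like the one the abstract describes, a model would be fitted on each sampled training set and these two indicators computed on a separate validation set. The quadratic pair count in `auc` is chosen here for readability rather than speed.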

Cite

Biron, M., & Bravo, C. (2014). On the discriminative power of credit scoring systems trained on independent samples. In Studies in Classification, Data Analysis, and Knowledge Organization (Vol. 47, pp. 247–254). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-319-01595-8_27
