Effects of data grouping on calibration measures of classifiers

Stephan Dreiseitl; Melanie Osl

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 6927 LNCS(PART 1) 359-366

DOI: 10.1007/978-3-642-27549-4_46

Abstract

The calibration of a probabilistic classifier refers to the extent to which its probability estimates match the true class membership probabilities. Measuring the calibration of a classifier usually relies on performing chi-squared goodness-of-fit tests between grouped probabilities and the observations in these groups. We considered alternatives to the Hosmer-Lemeshow test, the standard chi-squared test with groups based on sorted model outputs. Since this grouping does not represent "natural" groupings in data space, we investigated a chi-squared test with grouping strategies in data space. Using a series of artificial data sets for which the correct models are known, and one real-world data set, we analyzed the performance of the Pigeon-Heyse test with groupings by self-organizing maps, k-means clustering, and random assignment of points to groups. We observed that the Pigeon-Heyse test offers slightly better performance than the Hosmer-Lemeshow test while being able to locate regions of poor calibration in data space. © 2012 Springer-Verlag.
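
To make the role of the grouping concrete, the following is a minimal sketch (not the authors' code) of a Hosmer-Lemeshow-style chi-squared statistic that accepts an arbitrary assignment of points to groups. The function names `grouped_chi2`, `sorted_output_groups`, and `random_groups` are hypothetical and chosen for illustration only. The sketch uses the standard Hosmer-Lemeshow variance term and does not include the adjustment made by the Pigeon-Heyse test; groupings from k-means or self-organizing maps could be passed in as a list of cluster labels, which is assumed to be computed elsewhere.

```python
import random


def grouped_chi2(probs, labels, groups):
    """Chi-squared calibration statistic for a given grouping of points.

    probs  : predicted probabilities of the positive class
    labels : observed class labels (0 or 1)
    groups : group identifier for each point (any hashable value)
    """
    # Per group, accumulate: size, expected positives, observed positives.
    stats = {}
    for p, y, g in zip(probs, labels, groups):
        n, expected, observed = stats.get(g, (0, 0.0, 0))
        stats[g] = (n + 1, expected + p, observed + y)

    total = 0.0
    for n, expected, observed in stats.values():
        mean_p = expected / n
        variance = n * mean_p * (1.0 - mean_p)  # Hosmer-Lemeshow variance term
        if variance > 0:  # skip degenerate groups with mean probability 0 or 1
            total += (observed - expected) ** 2 / variance
    return total


def sorted_output_groups(probs, n_groups=10):
    """Hosmer-Lemeshow grouping: equal-sized groups of sorted model outputs."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    groups = [0] * len(probs)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(probs)
    return groups


def random_groups(n_points, n_groups=10, seed=0):
    """Random assignment of points to groups (illustrative baseline)."""
    rng = random.Random(seed)
    return [rng.randrange(n_groups) for _ in range(n_points)]


if __name__ == "__main__":
    rng = random.Random(1)
    probs = [rng.random() for _ in range(1000)]
    # Labels drawn from the stated probabilities, i.e. a calibrated model.
    labels = [1 if rng.random() < p else 0 for p in probs]
    print(grouped_chi2(probs, labels, sorted_output_groups(probs)))
    print(grouped_chi2(probs, labels, random_groups(len(probs))))
```

With groups based on sorted outputs, the statistic is conventionally compared against a chi-squared distribution with the number of groups minus two degrees of freedom. Separating the grouping from the statistic keeps the sketch open to groupings defined in data space rather than in output space.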

Cite

Dreiseitl, S., & Osl, M. (2012). Effects of data grouping on calibration measures of classifiers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6927 LNCS, pp. 359–366). https://doi.org/10.1007/978-3-642-27549-4_46
