Master Your Metrics with Calibration

Abstract

Machine learning models deployed in real-world applications are often evaluated with precision-based metrics such as F1-score or AUC-PR (Area Under the Curve of Precision Recall). Heavily dependent on the class prior, such metrics make it difficult to interpret the variation of a model’s performance over different subpopulations/subperiods in a dataset. In this paper, we propose a way to calibrate the metrics so that they can be made invariant to the prior. We conduct a large number of experiments on balanced and imbalanced data to assess the behavior of calibrated metrics and show that they improve interpretability and provide better control over what is really measured. We describe specific real-world use-cases where calibration is beneficial, such as model monitoring in production, reporting, or fairness evaluation.
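The idea of making precision prior-invariant can be illustrated concretely. A minimal sketch, assuming the standard decomposition precision = TPR·π / (TPR·π + FPR·(1−π)): TPR and FPR do not depend on the class prior π, so recomputing precision with a fixed reference prior gives a value that is comparable across subpopulations with different priors. The function name calibrated_precision and the ref_prior parameter below are illustrative assumptions, not the authors' exact formulation or API.

```python
import numpy as np

def calibrated_precision(y_true, y_pred, ref_prior=0.5):
    """Precision re-expressed at a chosen reference positive-class prior.

    Sketch only: precision is rewritten in terms of TPR and FPR (which are
    prior-independent) and recombined with a fixed reference prior, so that
    values from subpopulations with different class ratios are comparable.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    # True positive rate (recall) and false positive rate.
    tpr = (y_pred & y_true).sum() / max(y_true.sum(), 1)
    fpr = (y_pred & ~y_true).sum() / max((~y_true).sum(), 1)

    denom = tpr * ref_prior + fpr * (1.0 - ref_prior)
    return tpr * ref_prior / denom if denom > 0 else 0.0

# Usage: two subpopulations with identical TPR/FPR but different priors
# yield different raw precision, yet the same calibrated precision.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]   # positive prior = 0.25
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]
print(calibrated_precision(y_true, y_pred, ref_prior=0.5))
```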

Citation (APA)

Siblini, W., Fréry, J., He-Guelton, L., Oblé, F., & Wang, Y. Q. (2020). Master Your Metrics with Calibration. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12080 LNCS, pp. 457–469). Springer. https://doi.org/10.1007/978-3-030-44584-3_36
