Optimizing classifiers for hypothetical scenarios

Abstract

The deployment of classification models is an integral component of many modern data mining and machine learning applications. A typical classification model is built with the tacit assumption that the deployment scenario by which it is evaluated is fixed and fully characterized. Yet, in the practical deployment of classification methods, important aspects of the application environment, such as the misclassification costs, may be uncertain during model building. Moreover, a single classification model may be applied in several different deployment scenarios. In this work, we propose a method to optimize a model for uncertain deployment scenarios. We begin by deriving a relationship between two evaluation measures, the H measure and cost curves, that may be used to address uncertainty in classifier performance. We show that when uncertainty in classifier performance is modeled as a probabilistic belief that is a function of this underlying relationship, a natural definition of risk emerges for both classifiers and instances. We then leverage this notion of risk to develop a boosting-based algorithm, which we call RiskBoost, that directly mitigates classifier risk, and we demonstrate that it outperforms AdaBoost on a diverse selection of datasets.
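
As a point of reference for the comparison described above, the sketch below shows a minimal implementation of the standard (discrete) AdaBoost baseline named in the abstract, built on scikit-learn decision stumps. The paper's RiskBoost replaces the instance re-weighting step with one driven by its risk notion derived from the H measure / cost-curve relationship; that substitution is not reproduced here, and all function and variable names are illustrative rather than taken from the paper.

```python
# Minimal AdaBoost sketch (baseline named in the abstract), not RiskBoost.
# Assumes X is a 2-D numpy array of features and y a vector of labels in {-1, +1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=50):
    """Standard discrete AdaBoost with depth-1 decision-tree weak learners."""
    n = len(y)
    w = np.full(n, 1.0 / n)          # instance weights, start uniform
    learners, alphas = [], []

    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)

        err = np.sum(w * (pred != y)) / np.sum(w)
        if err >= 0.5:               # weak learner no better than chance
            break
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1.0 - err) / err)

        # AdaBoost's exponential re-weighting of misclassified instances.
        # Per the abstract, RiskBoost instead drives this update with an
        # instance-level risk; that step is only marked here, not shown.
        w *= np.exp(-alpha * y * pred)
        w /= np.sum(w)

        learners.append(stump)
        alphas.append(alpha)

    return learners, alphas


def adaboost_predict(learners, alphas, X):
    """Weighted-vote prediction of the boosted ensemble."""
    score = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.sign(score)
```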

Citation (APA)

Johnson, R. A., Raeder, T., & Chawla, N. V. (2015). Optimizing classifiers for hypothetical scenarios. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9077, pp. 264–276). Springer Verlag. https://doi.org/10.1007/978-3-319-18038-0_21
