Classifier risk estimation under limited labeling resources

Abstract

Evaluating a trained system is an important component of machine learning. Labeling test data for large-scale evaluation of a trained model can be extremely time-consuming and expensive. In this paper we propose strategies for estimating the performance of a classifier using as few labeled samples as possible. Specifically, we assume a labeling budget is given, and the goal is to obtain a good estimate of the classifier's performance within that budget. We propose strategies that yield a precise estimate of classifier accuracy under this restricted labeling budget. We show that these strategies can reduce the variance in the estimate of classifier accuracy by a significant amount compared to simple random sampling (over 65% in several cases). In terms of labeling resources, the reduction in the number of samples required (compared to random sampling) to estimate the classifier accuracy with only 1% error is as high as 60% in some cases.
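
The abstract does not spell out the estimation strategies themselves, so the following is only a minimal illustrative sketch of one plausible budget-constrained approach: stratified sampling over the classifier's confidence scores, compared against simple random sampling. The synthetic data, the budget, and parameters such as n_strata are assumptions for illustration, not the authors' actual method.

```python
# Illustrative sketch (not the paper's exact method): estimate classifier accuracy
# under a fixed labeling budget by stratifying the unlabeled pool on confidence
# scores, and compare the estimator's spread against simple random sampling.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "deployment" pool: true labels are hidden until we "spend" budget to label.
N = 100_000
true_labels = rng.integers(0, 2, size=N)
# Simulated classifier: higher confidence means the prediction is more likely correct.
confidence = rng.beta(5, 2, size=N)
correct = rng.random(N) < confidence
pred_labels = np.where(correct, true_labels, 1 - true_labels)

true_accuracy = np.mean(pred_labels == true_labels)
budget = 500  # labeling budget (illustrative)

def simple_random_estimate(budget):
    # Label a uniform random subset and average correctness.
    idx = rng.choice(N, size=budget, replace=False)
    return np.mean(pred_labels[idx] == true_labels[idx])

def stratified_estimate(budget, n_strata=5):
    # Stratify by confidence quantiles; allocate the budget proportionally
    # to stratum size and weight per-stratum accuracies by stratum proportion.
    edges = np.quantile(confidence, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, confidence, side="right") - 1, 0, n_strata - 1)
    estimate = 0.0
    for s in range(n_strata):
        members = np.flatnonzero(strata == s)
        n_s = max(1, int(round(budget * len(members) / N)))
        sample = rng.choice(members, size=min(n_s, len(members)), replace=False)
        acc_s = np.mean(pred_labels[sample] == true_labels[sample])
        estimate += (len(members) / N) * acc_s
    return estimate

# Compare estimator variance over repeated trials with the same budget.
srs = [simple_random_estimate(budget) for _ in range(200)]
strat = [stratified_estimate(budget) for _ in range(200)]
print(f"true accuracy      : {true_accuracy:.4f}")
print(f"SRS        mean/std: {np.mean(srs):.4f} / {np.std(srs):.4f}")
print(f"stratified mean/std: {np.mean(strat):.4f} / {np.std(strat):.4f}")
```

Because confidence correlates with correctness in this toy setup, examples within a stratum are more homogeneous than the pool as a whole, so the stratified estimator typically shows a noticeably smaller standard deviation than simple random sampling at the same budget; that is the kind of variance reduction the abstract quantifies.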

Citation (APA)

Kumar, A., & Raj, B. (2018). Classifier risk estimation under limited labeling resources. In Lecture Notes in Computer Science (Vol. 10937 LNAI, pp. 3–15). Springer Verlag. https://doi.org/10.1007/978-3-319-93034-3_1
