Recovering true classifier performance in positive-unlabeled learning

Shantanu Jain; Martha White; Predrag Radivojac

Conference Proceedings, Open Access

31st AAAI Conference on Artificial Intelligence, AAAI 2017 (2017) 2066-2072

DOI: 10.1609/aaai.v31i1.10937

33 Citations

87 Readers

Abstract

A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that the typically used performance measures, such as the receiver operating characteristic curve or the precision-recall curve obtained on such data, can be corrected with the knowledge of class priors; i.e., the proportions of the positive and negative examples in the unlabeled data. We extend the results to a noisy setting where some of the examples labeled positive are in fact negative and show that the correction also requires the knowledge of the proportion of noisy examples in the labeled positives. Using state-of-the-art algorithms to estimate the positive class prior and the proportion of noise, we experimentally evaluate two correction approaches and demonstrate their efficacy on real-life data.
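To illustrate the kind of correction the abstract describes, here is a minimal sketch for the noise-free case. It is not taken from the paper. It follows from one assumption: the unlabeled set is a mixture of positives (proportion `alpha`) and negatives (proportion `1 - alpha`), and the labeled positives are a representative sample of the positive class. Under that assumption, the rate at which unlabeled examples are predicted positive is `alpha * tpr + (1 - alpha) * fpr`, which can be solved for the true false positive rate and the true precision. The function name `correct_pu_rates` and its parameters are hypothetical, chosen for illustration.

```python
import math


def correct_pu_rates(tpr, fpr_pu, alpha):
    """Recover the true FPR and precision at one decision threshold.

    Hypothetical helper; assumes no label noise in the positives.

    tpr    -- fraction of labeled positives predicted positive
    fpr_pu -- fraction of unlabeled examples predicted positive
              (the biased "false positive rate" measured on PU data)
    alpha  -- proportion of positives in the unlabeled data (class prior)
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    # Mixture identity: fpr_pu = alpha * tpr + (1 - alpha) * fpr
    fpr = (fpr_pu - alpha * tpr) / (1.0 - alpha)
    # Sampling error can push the estimate slightly outside [0, 1].
    fpr = min(1.0, max(0.0, fpr))

    # True precision on the unlabeled population:
    # alpha * tpr / (alpha * tpr + (1 - alpha) * fpr) = alpha * tpr / fpr_pu
    if fpr_pu > 0.0:
        precision = min(1.0, alpha * tpr / fpr_pu)
    else:
        precision = math.nan  # nothing predicted positive
    return fpr, precision


# Example: true TPR 0.8, true FPR 0.1, class prior 0.25.
# The rate observed on the unlabeled data would be
# 0.25 * 0.8 + 0.75 * 0.1 = 0.275.
fpr, precision = correct_pu_rates(tpr=0.8, fpr_pu=0.275, alpha=0.25)
```

Applying the function at every threshold gives a corrected receiver operating characteristic curve and a corrected precision-recall curve. The noisy setting mentioned in the abstract needs the additional noise proportion and is not covered by this sketch.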

Cite (APA)

Jain, S., White, M., & Radivojac, P. (2017). Recovering true classifier performance in positive-unlabeled learning. In 31st AAAI Conference on Artificial Intelligence, AAAI 2017 (pp. 2066–2072). AAAI Press. https://doi.org/10.1609/aaai.v31i1.10937
