Crowdsourcing with Contextual Uncertainty

Viet An Nguyen; Peibei Shi; Jagdish Ramakrishnan; Narjes Torabi; Nimar S. Arora; Udi Weinsberg; Michael Tingley

Conference Proceedings, Open Access

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2022) 3645-3655

DOI: 10.1145/3534678.3539184

Abstract

We study a crowdsourcing setting where we need to infer the latent truth about a task given observed labels together with context in the form of a classifier score. We present Theodon, a hierarchical non-parametric Bayesian model, developed and deployed at Meta, that captures both the prevalence of label categories and the accuracy of labelers as functions of the classifier score. Theodon uses Gaussian processes to model the non-uniformity of mistakes over the range of classifier scores. For our experiments, we used data generated from integrity applications at Meta as well as public datasets. We showed that Theodon (1) obtains 1-4% improvement in AUC-PR predictions on items' true labels compared to state-of-the-art baselines for public datasets, (2) is effective as a calibration method, and (3) provides detailed insights on labelers' performances.
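
To make the setting concrete, the following is a minimal sketch of how a classifier score can shape both the prior over an item's true label and the reliability assigned to each observed label. It is not Theodon: the abstract describes Gaussian processes learned from data, whereas this sketch uses two fixed, hypothetical functions (`prevalence` and `labeler_accuracy`, both invented here for illustration) and a plain Bayes update for a binary label, assuming labelers err independently.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def prevalence(score: float) -> float:
    """Hypothetical prior P(true label = 1 | score); an assumption, not from the paper."""
    return sigmoid(6.0 * (score - 0.5))


def labeler_accuracy(score: float) -> float:
    """Hypothetical P(labeler is correct | score); assumed lowest near score 0.5."""
    return 0.6 + 0.35 * abs(2.0 * score - 1.0)


def posterior_positive(score: float, labels: Sequence[int]) -> float:
    """Posterior P(true label = 1 | score, labels) for binary labels in {0, 1}.

    Assumes labelers make independent mistakes given the true label and score.
    """
    prior = prevalence(score)
    acc = labeler_accuracy(score)
    like_pos = 1.0  # likelihood of the observed labels if the truth is 1
    like_neg = 1.0  # likelihood of the observed labels if the truth is 0
    for y in labels:
        like_pos *= acc if y == 1 else 1.0 - acc
        like_neg *= acc if y == 0 else 1.0 - acc
    numerator = prior * like_pos
    return numerator / (numerator + (1.0 - prior) * like_neg)


if __name__ == "__main__":
    # Same two labels, different scores: the score changes both prior and trust.
    print(posterior_positive(0.5, [1, 0]))
    print(posterior_positive(0.9, [1, 0]))
```

In this toy version, two conflicting labels at a score of 0.5 cancel out and leave the prior unchanged, while the same labels at a high score are resolved by the score-dependent prior. A model in the spirit of the abstract would instead learn both functions from data rather than fix them by hand.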

Cite

Nguyen, V. A., Shi, P., Ramakrishnan, J., Torabi, N., Arora, N. S., Weinsberg, U., & Tingley, M. (2022). Crowdsourcing with Contextual Uncertainty. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 3645–3655). Association for Computing Machinery. https://doi.org/10.1145/3534678.3539184
