Help Me to Help You

  • Wright D
  • Fortson L
  • Lintott C
  • et al.
N/ACitations
Citations of this article
7Readers
Mendeley users who have this article in their library.

Abstract

The increasing size of datasets with which researchers in a variety of domains are confronted has led to a range of creative responses, including the deployment of modern machine learning techniques and the advent of large scale “citizen science projects.” However, the ability of the latter to provide suitably large training sets for the former is stretched as the size of the problem (and competition for attention amongst projects) grows. We explore the application of unsupervised learning to leverage structure that exists in an initially unlabelled dataset. We simulate grouping similar points before presenting those groups to volunteers to label. Citizen science labelling of grouped data is more efficient, and the gathered labels can be used to improve efficiency further for labelling future data.To demonstrate these ideas, we perform experiments using data from the Pan-STARRS Survey for Transients (PSST) with volunteer labels gathered by the Zooniverse project, Supernova Hunters and a simulated project using the MNIST handwritten digit dataset. Our results show that, in the best case, we might expect to reduce the required volunteer effort by 87.0% and 92.8% for the two datasets, respectively. These results illustrate a symbiotic relationship between machine learning and citizen scientists where each empowers the other with important implications for the design of citizen science projects in the future.

Cite

CITATION STYLE

APA

Wright, D. E., Fortson, L., Lintott, C., Laraia, M., & Walmsley, M. (2019). Help Me to Help You. ACM Transactions on Social Computing, 2(3), 1–20. https://doi.org/10.1145/3362741

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free