Quality control attack schemes in crowdsourcing

Abstract

An important precondition for building effective AI models is the collection of training data at scale. Crowdsourcing is a popular methodology to achieve this goal, but its adoption introduces novel challenges in data quality control, namely dealing with under-performing and malicious annotators. One of the most popular quality assurance mechanisms, especially in paid micro-task crowdsourcing, is the use of a small set of pre-annotated tasks as a gold standard, used to assess annotators' quality in real time. In this paper, we highlight a set of vulnerabilities this scheme suffers from: a group of colluding crowd workers can easily implement and deploy a decentralised machine learning inferential system to detect and signal which parts of the task are more likely to be gold questions, rendering them ineffective as a quality control tool. Moreover, we demonstrate that the most common countermeasures against this attack are ineffective in practical scenarios. The basic architecture of the inferential system is composed of a browser plug-in and an external server where the colluding workers can share information. We implement and validate the attack scheme by means of experiments on real-world data from a popular crowdsourcing platform.
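To illustrate the collusion idea, the sketch below shows one simple way such an external server could flag likely gold questions. It relies on the assumption (not stated in the abstract, but a common property of gold-based quality control) that gold questions come from a small pre-annotated pool and are therefore served to many workers, while ordinary tasks are seen by few. The class and threshold heuristic are illustrative inventions, not the paper's actual inferential system.

```python
from collections import Counter

class GoldDetectionServer:
    """Toy aggregation server for colluding workers.

    Each worker (e.g. via a browser plug-in) reports the task IDs it was
    assigned. Because gold questions are reused across many workers, task
    IDs reported unusually often are likely gold. This is a hypothetical
    frequency heuristic, far simpler than the paper's ML-based system.
    """

    def __init__(self):
        self.seen = Counter()  # task ID -> number of workers who reported it

    def report(self, worker_id, task_ids):
        """Record the task IDs observed by one worker."""
        self.seen.update(task_ids)

    def likely_gold(self, threshold):
        """Return task IDs reported by at least `threshold` workers."""
        return {task for task, count in self.seen.items() if count >= threshold}
```

For example, if five workers each report two shared tasks plus their own unique tasks, only the shared (likely gold) tasks exceed a threshold of three reports.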

Citation (APA)

Checco, A., Bates, J., & Demartini, G. (2019). Quality control attack schemes in crowdsourcing. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 6136–6140). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/850
