In this paper we introduce CrowdTruth, an open-source software framework for machine-human computation that implements a novel approach to gathering human annotation data for a variety of media (e.g. text, image, video). The CrowdTruth approach embodied in the software captures human semantics through a pipeline of four processes: a) combining various forms of machine processing of the media in order to better understand the input content and optimize its suitability for micro-tasks, thus reducing the time and cost of the crowdsourcing process; b) providing reusable human-computation task templates that elicit the maximum diversity of human interpretations, thus collecting richer human semantics; c) implementing 'disagreement metrics', i.e. the CrowdTruth metrics, to support deep analysis of the quality and semantics of the crowdsourced data; and d) providing an interface to support data and results visualization. Instead of the traditional inter-annotator agreement, we use annotator disagreement as a useful signal for evaluating data quality, ambiguity, and vagueness. We demonstrate the applicability and robustness of this approach to a variety of problems across multiple domains. Moreover, we show the advantages of using open standards and the extensibility of the framework with new data modalities and annotation tasks.
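To make the disagreement idea concrete, the following is a minimal sketch of one way such a metric can be computed: each worker's annotation of a media unit is encoded as a binary vector over the candidate answers, and a worker's agreement with the crowd is the cosine similarity between their vector and the aggregated unit vector with their own contribution removed. The vector encoding and function names here are illustrative assumptions, not the framework's exact API.

```python
# Sketch of a disagreement-based quality signal in the spirit of the
# CrowdTruth metrics. Encoding and names are illustrative assumptions.
import math

def cosine(u, v):
    # Standard cosine similarity; 0.0 when either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def unit_vector(worker_vectors):
    # Aggregate all annotations for one unit into a single count vector.
    return [sum(col) for col in zip(*worker_vectors)]

def worker_unit_agreement(worker_vectors, i):
    # Compare worker i against the rest of the crowd (leave-one-out),
    # so a worker is not rewarded for agreeing with themselves.
    unit = unit_vector(worker_vectors)
    rest = [u - w for u, w in zip(unit, worker_vectors[i])]
    return cosine(worker_vectors[i], rest)

# Three workers annotate one unit over four candidate answers.
annotations = [
    [1, 0, 1, 0],  # worker 0
    [1, 0, 0, 0],  # worker 1
    [0, 0, 0, 1],  # worker 2 diverges from the majority
]
scores = [worker_unit_agreement(annotations, i) for i in range(3)]
```

Consistently low scores for a worker across many units would suggest spam or low-quality work, while mixed scores concentrated on a single unit would suggest genuine ambiguity in that unit — which is the distinction the paper's metrics are designed to surface.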
CITATION STYLE
Inel, O., Khamkham, K., Cristea, T., Dumitrache, A., Rutjes, A., van der Ploeg, J., … Sips, R. J. (2014). Crowdtruth: Machine-human computation framework for harnessing disagreement in gathering annotated data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8797, pp. 486–504). Springer Verlag. https://doi.org/10.1007/978-3-319-11915-1_31