Ground-truth identification, the process of inferring the most probable labels for a dataset from crowdsourced annotations, is a crucial task in making the dataset usable, e.g., for supervised learning. Nevertheless, the process is challenging because annotations from multiple annotators are inconsistent and noisy. Existing methods require a set of data samples with corresponding ground-truth labels to precisely estimate annotator performance, but such samples are difficult to obtain in practice. Moreover, the process requires a post-editing step to validate indefinite labels, which are generally unidentifiable without thoroughly inspecting the entire annotated dataset. To address these challenges, this paper introduces: 1) the attenuated score (A-score), an indicator that locally measures annotator performance over segments of annotation sequences, and 2) a label aggregation method that applies the A-score to ground-truth identification. The experimental results demonstrate that A-score label aggregation outperforms majority vote on all datasets by accurately recovering more labels. It also achieves higher F1 scores than the strong baselines on all multi-class datasets. Additionally, the results suggest that the A-score is a promising indicator for identifying indefinite labels during the post-editing procedure.
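The abstract does not give the A-score formula, but the aggregation idea it builds on can be sketched generically: weight each annotator's vote by a reliability estimate, with unweighted counting reducing to the plain majority-vote baseline the paper compares against. The sketch below is a minimal illustration under that assumption; the function name `aggregate_labels` and the example weights are hypothetical, not the paper's actual A-score computation.

```python
from collections import defaultdict

def aggregate_labels(annotations, weights=None):
    """Aggregate crowdsourced labels per item.

    annotations: dict mapping item_id -> list of (annotator_id, label)
    weights: optional dict mapping annotator_id -> reliability weight;
             when None, every vote counts equally (plain majority vote).
    Returns a dict mapping item_id -> aggregated label.
    """
    result = {}
    for item, votes in annotations.items():
        tally = defaultdict(float)
        for annotator, label in votes:
            # Unweighted vote for majority voting; reliability-weighted otherwise.
            w = 1.0 if weights is None else weights.get(annotator, 1.0)
            tally[label] += w
        result[item] = max(tally, key=tally.get)
    return result

# Majority vote vs. reliability-weighted vote (weights are illustrative only).
annotations = {
    "doc1": [("a1", "pos"), ("a2", "neg"), ("a3", "neg")],
    "doc2": [("a1", "pos"), ("a2", "pos"), ("a3", "neg")],
}
print(aggregate_labels(annotations))                                  # {'doc1': 'neg', 'doc2': 'pos'}
print(aggregate_labels(annotations, {"a1": 0.9, "a2": 0.4, "a3": 0.4}))  # {'doc1': 'pos', 'doc2': 'pos'}
```

Note how `doc1` flips from "neg" to "pos" once the reliable annotator's vote outweighs the two unreliable ones; the paper's contribution is estimating such reliability locally, per segment of an annotation sequence, rather than globally per annotator.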
Thammasudjarit, R., Plangprasopchok, A., & Pluempitiwiriyawej, C. (2017). A novel label aggregation with attenuated scores for ground-truth identification of dataset annotation with crowdsourcing. IEICE Transactions on Information and Systems, E100D(4), 750–757. https://doi.org/10.1587/transinf.2016DAP0024