CroAno: A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset

Abstract

In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides basic crowd annotation features such as fast tagging and data management, CroAno provides a systematic solution for improving the label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities, which makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We decompose the model's entity prediction errors into six fine-grained entity error types. Users can employ this error taxonomy to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. On the two revised datasets, model performance improves by 1.96% and 2.57% F1, respectively.
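
The abstract does not describe how the Inconsistency Detector works internally. As a rough illustration of what corpus-level label inconsistency can mean, the sketch below flags entity mentions that carry more than one label across a corpus. The function name `find_inconsistent_mentions` and the input layout (sentences as lists of `(mention, label)` pairs) are assumptions made for this example and are not taken from CroAno.

```python
from collections import Counter, defaultdict


def find_inconsistent_mentions(corpus):
    """Return mentions that were annotated with more than one label.

    `corpus` is assumed to be a list of sentences, each a list of
    (mention_text, label) pairs. This layout is hypothetical and is
    not CroAno's actual data format.
    """
    labels_by_mention = defaultdict(Counter)
    for sentence in corpus:
        for mention, label in sentence:
            labels_by_mention[mention][label] += 1

    # Keep only mentions whose annotations disagree across the corpus.
    return {
        mention: dict(counts)
        for mention, counts in labels_by_mention.items()
        if len(counts) > 1
    }


# Toy corpus: the same mention is labeled LOC twice and ORG once.
toy_corpus = [
    [("北京", "LOC"), ("张三", "PER")],
    [("北京", "ORG")],
    [("北京", "LOC")],
]
inconsistent = find_inconsistent_mentions(toy_corpus)
```

A reviewer could then inspect each flagged mention and correct the minority label in one batch where appropriate. Some mentions are legitimately ambiguous, so the counts are a prompt for human review rather than an automatic fix.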

Cite

Zhang, B., Li, Z., Gan, Z., Chen, Y., Wan, J., Liu, K., … Shi, Y. (2021). CroAno: A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 275–282). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-demo.32
