Abstract
In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides basic crowd annotation features such as fast tagging and data management, CroAno provides a systematic solution for improving the label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities and makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We decompose the model's entity prediction errors into six fine-grained entity error types. Users can employ this error taxonomy to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. On the two revised datasets, model performance improves by +1.96% and +2.57% F1, respectively.
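To make the idea of corpus-level label inconsistency concrete, the following is a minimal sketch of one simple way such inconsistency could be surfaced: flagging entity mentions whose surface form receives more than one label across a corpus. It is not CroAno's actual detector, which the abstract does not specify. The function name `find_inconsistent_mentions`, the `(mention, label)` input format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def find_inconsistent_mentions(corpus):
    """Return mentions that carry more than one label across the corpus.

    `corpus` is a hypothetical format: a list of sentences, each a list of
    (mention_text, label) pairs. This is an illustrative heuristic only.
    """
    label_counts = defaultdict(Counter)
    for sentence in corpus:
        for mention, label in sentence:
            label_counts[mention][label] += 1
    # Keep only mentions annotated with at least two distinct labels.
    return {
        mention: dict(counts)
        for mention, counts in label_counts.items()
        if len(counts) > 1
    }


# Toy annotated corpus (invented for illustration, not from the paper).
corpus = [
    [("北京", "LOC"), ("张三", "PER")],
    [("北京", "ORG")],
    [("张三", "PER")],
]
inconsistent = find_inconsistent_mentions(corpus)
print(inconsistent)
```

In this sketch, each flagged mention comes with its per-label counts, so a reviewer could see the majority label and correct the minority annotations together, in the spirit of the batch correction described above.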
Zhang, B., Li, Z., Gan, Z., Chen, Y., Wan, J., Liu, K., … Shi, Y. (2021). CroAno: A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 275–282). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-demo.32