Fast coupled sequence labeling on heterogeneous annotations via context-aware pruning

Abstract

The recently proposed coupled sequence labeling approach has been shown to effectively exploit multiple labeled datasets with heterogeneous annotations, but it suffers from severe inefficiency due to the large bundled tag space (Li et al., 2015). In their case study of part-of-speech (POS) tagging, Li et al. (2015) manually designed context-free tag-to-tag mapping rules, at considerable effort, to reduce the tag space. This paper proposes a context-aware pruning approach that imposes token-wise constraints on the tag space based on contextual evidence, making the coupled approach efficient enough to be applied, for the first time, to the more complex task of joint word segmentation (WS) and POS tagging. Experiments show that, using the large-scale People Daily corpus as auxiliary heterogeneous data, the coupled approach improves F-score by 95.55 - 94.88 = 0.67% on WS and by 90.58 - 89.49 = 1.09% on joint WS&POS on the Penn Chinese Treebank. All code is released at http://hlt.suda.edu.cn/~zhli.
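
To make the size problem concrete, the following is a minimal sketch of a bundled tag space and of one possible token-wise pruning scheme. It is not the authors' implementation: the abstract does not specify how contextual evidence is computed, so the sketch assumes that each token comes with per-scheme tag marginals (for example, from two baseline taggers) and keeps only the top-k tags on each side. The tag sets, the probabilities, and the names `bundled_tags`, `top_k`, and `prune_token` are all hypothetical.

```python
from itertools import product


def bundled_tags(tags_a, tags_b):
    """Full bundled tag space: every pair (a, b) of tags from the two schemes."""
    return list(product(tags_a, tags_b))


def top_k(marginals, k):
    """Return the k highest-scoring tags from a {tag: probability} dict."""
    return sorted(marginals, key=marginals.get, reverse=True)[:k]


def prune_token(marginals_a, marginals_b, k=2):
    """Bundled candidates for one token: product of each side's top-k tags.

    Assumption: the marginals already reflect the token's context, which is
    what makes this pruning context-aware rather than a fixed mapping table.
    """
    return list(product(top_k(marginals_a, k), top_k(marginals_b, k)))


# Hypothetical tag sets for two annotation schemes (illustrative only).
TAGS_A = ["NN", "VV", "JJ", "AD"]
TAGS_B = ["n", "v", "a", "d", "vn"]

# Hypothetical per-token marginals for a two-token sentence.
sentence = [
    ({"NN": 0.90, "VV": 0.07, "JJ": 0.02, "AD": 0.01},
     {"n": 0.85, "vn": 0.10, "v": 0.03, "a": 0.01, "d": 0.01}),
    ({"VV": 0.60, "NN": 0.35, "JJ": 0.03, "AD": 0.02},
     {"v": 0.55, "vn": 0.40, "n": 0.03, "a": 0.01, "d": 0.01}),
]

full_size = len(bundled_tags(TAGS_A, TAGS_B))
pruned = [prune_token(ma, mb, k=2) for ma, mb in sentence]

print("bundled tags per token without pruning:", full_size)
for i, candidates in enumerate(pruned):
    print(f"token {i}: {len(candidates)} candidates -> {candidates}")
```

Under these assumptions each token keeps at most k × k bundled candidates instead of |A| × |B|, and the surviving candidates differ from token to token, unlike a context-free tag-to-tag mapping that allows the same pairs everywhere.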

Cite

Li, Z., Chao, J., Zhang, M., & Yang, J. (2016). Fast coupled sequence labeling on heterogeneous annotations via context-aware pruning. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 753–762). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d16-1072
