Fast coupled sequence labeling on heterogeneous annotations via context-aware pruning

Zhenghua Li; Jiayuan Chao; Min Zhang; Jiwen Yang

Conference Proceedings, Open Access

EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (2016) 753-762

DOI: 10.18653/v1/d16-1072

9 Citations

84 Readers

Abstract

The recently proposed coupled sequence labeling approach has been shown to effectively exploit multiple labeled datasets with heterogeneous annotations, but it suffers from severe inefficiency due to the large bundled tag space (Li et al., 2015). In their case study of part-of-speech (POS) tagging, Li et al. (2015) manually designed context-free tag-to-tag mapping rules, at considerable effort, to reduce the tag space. This paper proposes a context-aware pruning approach that imposes token-wise constraints on the tag space based on contextual evidence, making the coupled approach efficient enough to be applied, for the first time, to the more complex task of joint word segmentation (WS) and POS tagging. Experiments show that, using the large-scale People Daily corpus as auxiliary heterogeneous data, the coupled approach improves F-score by 95.55 - 94.88 = 0.67% on WS and by 90.58 - 89.49 = 1.09% on joint WS&POS on the Penn Chinese Treebank. All code is released at http://hlt.suda.edu.cn/~zhli.
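
To make the idea of a bundled tag space and token-wise pruning concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' released implementation. The toy tag sets, the per-token marginal probabilities, and the helper names `bundled_space`, `prune_token`, and `pruned_bundles` are all hypothetical. The pruning rule (keep the top-k tags whose marginal meets a threshold) is only one plausible way to constrain each token using contextual evidence.

```python
from itertools import product

def bundled_space(tags_a, tags_b):
    """Full bundled tag space: every pair of tags from the two annotation schemes."""
    return list(product(tags_a, tags_b))

def prune_token(marginals, top_k=2, min_prob=0.05):
    """Keep at most top_k tags for one token whose marginal is >= min_prob.

    `marginals` maps tag -> probability for this token in its context
    (assumed to come from some baseline tagger; hypothetical here).
    """
    ranked = sorted(marginals.items(), key=lambda kv: kv[1], reverse=True)
    kept = [tag for tag, prob in ranked[:top_k] if prob >= min_prob]
    # Always keep the single best tag so no token ends up with an empty set.
    return kept or [ranked[0][0]]

def pruned_bundles(marginals_a, marginals_b, top_k=2, min_prob=0.05):
    """Per-token bundled candidates built only from each side's surviving tags."""
    return [
        list(product(prune_token(ma, top_k, min_prob),
                     prune_token(mb, top_k, min_prob)))
        for ma, mb in zip(marginals_a, marginals_b)
    ]

# Toy tag sets and marginals for a two-token sentence (illustrative values only).
TAGS_A = ["NN", "VV", "AD"]
TAGS_B = ["n", "v", "d"]
MARGINALS_A = [{"NN": 0.90, "VV": 0.09, "AD": 0.01},
               {"VV": 0.60, "NN": 0.39, "AD": 0.01}]
MARGINALS_B = [{"n": 0.95, "v": 0.04, "d": 0.01},
               {"v": 0.50, "n": 0.45, "d": 0.05}]

full = bundled_space(TAGS_A, TAGS_B)
pruned = pruned_bundles(MARGINALS_A, MARGINALS_B)

if __name__ == "__main__":
    print("full bundled tags per token:", len(full))
    for i, cands in enumerate(pruned):
        print(f"token {i}: {len(cands)} candidates -> {cands}")
```

In this toy setting each token would otherwise consider all nine bundled tags. The per-token constraints reduce that to a handful, which is the kind of saving that makes decoding over a bundled space cheaper.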

Cite

Li, Z., Chao, J., Zhang, M., & Yang, J. (2016). Fast coupled sequence labeling on heterogeneous annotations via context-aware pruning. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 753–762). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d16-1072
