Overview of the NLPCC-ICCPOL 2016 shared task: Chinese word segmentation for micro-blog texts

Xipeng Qiu; Peng Qian; Zhan Shi

Book Chapter

Springer Verlag (2016), pp. 901–906

DOI: 10.1007/978-3-319-50496-4_84

Abstract

In this paper, we give an overview of the shared task at the 5th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2016): Chinese word segmentation for micro-blog texts. Unlike the widely used newswire datasets, the dataset for this shared task consists of relatively informal micro-texts. In addition, we use a new psychometric-inspired evaluation metric for Chinese word segmentation, which aims to balance the highly skewed word distribution across different levels of difficulty. The data and evaluation code can be downloaded from https://github.com/FudanNLP/NLPCC-WordSeg-Weibo.

Cite

APA

Qiu, X., Qian, P., & Shi, Z. (2016). Overview of the NLPCC-ICCPOL 2016 shared task: Chinese word segmentation for micro-blog texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10102, pp. 901–906). Springer Verlag. https://doi.org/10.1007/978-3-319-50496-4_84
