This paper describes our system for Chinese word segmentation of micro-blog text, one of the NLPCC-ICCPOL 2016 shared tasks [1]. We use a CRF (Conditional Random Field) model to treat word segmentation as a sequence labeling problem, and we select seven sets of features to train it. Under the weighted measures [2], the system achieves an fb score of 0.798144 on the closed track, 0.81968 on the semi-open track, and 0.82217 on the open track.
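To illustrate what casting segmentation as sequence labeling can look like, the sketch below converts a segmented sentence into per-character tags and back, and builds a small feature dictionary for each character. The B/M/E/S tag set, the function names, and the character n-gram features are assumptions chosen for illustration; the abstract does not state which tag scheme or which seven feature sets the system actually uses.

```python
# Illustrative sketch only: the B/M/E/S scheme and these features are assumed,
# not taken from the paper.

def words_to_tags(words):
    """Map a list of words to one tag per character.

    S = single-character word, B = begin, M = middle, E = end of a word.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words, buffer = [], ""
    for ch, tag in zip(chars, tags):
        buffer += ch
        if tag in ("E", "S"):  # a word closes on E or S
            words.append(buffer)
            buffer = ""
    if buffer:  # tolerate a tag sequence that ends mid-word
        words.append(buffer)
    return words


def char_features(chars, i):
    """Hypothetical unigram and bigram features for the character at index i."""
    prev_ch = chars[i - 1] if i > 0 else "<BOS>"
    next_ch = chars[i + 1] if i < len(chars) - 1 else "<EOS>"
    return {
        "c0": chars[i],
        "c-1": prev_ch,
        "c+1": next_ch,
        "c-1c0": prev_ch + chars[i],
        "c0c+1": chars[i] + next_ch,
    }


words = ["我", "喜欢", "微博"]
chars = list("".join(words))
tags = words_to_tags(words)
features = [char_features(chars, i) for i in range(len(chars))]
```

A CRF toolkit would then be trained on pairs of `features` and `tags` for each sentence, and `tags_to_words` would turn its predicted tag sequence back into a segmentation.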
CITATION STYLE
Leng, Y., Liu, W., Wang, S., & Wang, X. (2016). A feature-rich CRF segmenter for Chinese micro-blog. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10102, pp. 854–861). Springer Verlag. https://doi.org/10.1007/978-3-319-50496-4_78