We address the problem of segmenting Chinese text into words and propose a segmentation algorithm based on a trigram statistical language model. We discuss why a statistical language model is appropriate for Chinese word segmentation and present the algorithm. In particular, we solve the search problem introduced by the trigram model, which often leads to low performance. Finally, we discuss the identification of out-of-vocabulary (OOV) words and merge it into the trigram-based method to improve segmentation accuracy. © Springer-Verlag Berlin Heidelberg 2007.
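To make the general idea concrete, the following is a minimal sketch of trigram-scored segmentation, not the authors' algorithm. It picks the word sequence that maximizes the sum of log P(w_i | w_{i-2}, w_{i-1}) using a memoized search over split points. The function names, the toy vocabulary, the toy scores, and the `max_len` limit are all hypothetical, chosen only for illustration. Allowing any single character as a word is a crude stand-in for OOV handling and is an assumption, not the paper's method.

```python
import math
from functools import lru_cache


def segment(text, vocab, trigram_logprob, max_len=4):
    """Return the segmentation of `text` with the highest trigram log-probability."""
    n = len(text)

    @lru_cache(maxsize=None)
    def best(pos, w1, w2):
        # Best (score, words) for text[pos:], given the two preceding words.
        if pos == n:
            return 0.0, ()
        result = (-math.inf, ())
        for end in range(pos + 1, min(n, pos + max_len) + 1):
            w = text[pos:end]
            # Assumption: single characters are always allowed (crude OOV fallback).
            if w not in vocab and end - pos > 1:
                continue
            score, rest = best(end, w2, w)
            total = trigram_logprob(w1, w2, w) + score
            if total > result[0]:
                result = (total, (w,) + rest)
        return result

    return list(best(0, "<s>", "<s>")[1])


# Hypothetical toy model: a few trigram log-probabilities and a flat penalty otherwise.
TOY_VOCAB = {"研究", "研究生", "生命", "命", "起源"}
TOY_TRIGRAMS = {
    ("<s>", "<s>", "研究"): -1.0,
    ("<s>", "研究", "生命"): -1.0,
    ("研究", "生命", "起源"): -1.0,
}


def toy_logprob(w1, w2, w):
    # Unseen trigrams get a fixed low score; a real model would use smoothing.
    return TOY_TRIGRAMS.get((w1, w2, w), -10.0)


words = segment("研究生命起源", TOY_VOCAB, toy_logprob)
```

The search state is the position plus the two preceding words, which is why trigram decoding is more expensive than unigram or bigram decoding; a practical system would likely prune this state space rather than enumerate it.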
CITATION STYLE
Mao, J., Cheng, G., He, Y., & Xing, Z. (2007). A trigram statistical language model algorithm for Chinese word segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4613 LNCS, pp. 271–280). Springer Verlag. https://doi.org/10.1007/978-3-540-73814-5_26