A trigram statistical language model algorithm for Chinese word segmentation

Abstract

We address the problem of segmenting Chinese text into words and propose a trigram-model algorithm for the task. We discuss why a statistical language model is appropriate for Chinese word segmentation and give an algorithm for segmenting a Chinese text into words. In particular, we address the search problem that often leads to low performance when a trigram model is used. Finally, we discuss the identification of out-of-vocabulary (OOV) words and merge it into the trigram-model-based method in order to improve segmentation accuracy. © Springer-Verlag Berlin Heidelberg 2007.
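The abstract does not spell out the algorithm, so the following is only a minimal sketch of how trigram-based segmentation is commonly set up, not the authors' method. It assumes a small pre-segmented training corpus, add-one smoothing, a maximum word length of four characters, and a crude fallback that always allows single characters in place of real OOV identification. All function names (`train_trigram_counts`, `log_prob`, `segment`) are hypothetical. The search is a Viterbi-style dynamic program whose state is the last two words, which is one standard way to keep a trigram search tractable.

```python
import math
from collections import defaultdict

BOS = "<s>"  # sentence-start padding symbol (an assumption of this sketch)


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts from pre-segmented sentences."""
    tri = defaultdict(int)
    ctx = defaultdict(int)
    vocab = set()
    for words in sentences:
        padded = [BOS, BOS] + list(words)
        vocab.update(words)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            ctx[tuple(padded[i - 2:i])] += 1
    return tri, ctx, vocab


def log_prob(w, u, v, tri, ctx, vocab_size):
    """Add-one smoothed log P(w | u, v). The smoothing choice is an assumption."""
    return math.log((tri.get((u, v, w), 0) + 1) / (ctx.get((u, v), 0) + vocab_size))


def segment(text, tri, ctx, vocab, max_len=4):
    """Find the highest-scoring segmentation with a Viterbi-style search.

    best[pos] maps a state (second-to-last word, last word) to
    (best log score, previous state) for segmentations of text[:pos].
    """
    n = len(text)
    best = [dict() for _ in range(n + 1)]
    best[0][(BOS, BOS)] = (0.0, None)
    for pos in range(n):
        for (u, v), (score, _) in best[pos].items():
            for end in range(pos + 1, min(n, pos + max_len) + 1):
                w = text[pos:end]
                # Multi-character candidates must be in the vocabulary; single
                # characters are always allowed so every input has a segmentation.
                if end - pos > 1 and w not in vocab:
                    continue
                new_score = score + log_prob(w, u, v, tri, ctx, len(vocab))
                key = (v, w)
                if key not in best[end] or new_score > best[end][key][0]:
                    best[end][key] = (new_score, (u, v))
    # Backtrace from the best final state.
    state = max(best[n], key=lambda k: best[n][k][0])
    pos, words = n, []
    while pos > 0:
        _, last = state
        words.append(last)
        prev_state = best[pos][state][1]
        pos -= len(last)
        state = prev_state
    return words[::-1]


# Toy usage with a made-up three-sentence corpus.
corpus = [["我", "喜欢", "北京"], ["我", "喜欢", "上海"], ["北京", "大学"]]
tri, ctx, vocab = train_trigram_counts(corpus)
print(segment("我喜欢北京", tri, ctx, vocab))
```

Because each score depends only on the two preceding words, keeping one best entry per (position, last two words) state is enough to recover the optimal path without enumerating every segmentation.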

Cite

Mao, J., Cheng, G., He, Y., & Xing, Z. (2007). A trigram statistical language model algorithm for Chinese word segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4613 LNCS, pp. 271–280). Springer Verlag. https://doi.org/10.1007/978-3-540-73814-5_26
