Chinese pinyin-text conversion on segmented text

Abstract

Most current research and applications on Pinyin-to-Chinese word conversion employ hidden Markov models (HMMs), which in turn use a character-based language model. This is because Chinese texts are written without word boundaries. However, in some tasks that involve Pinyin-to-Chinese conversion, such as Chinese text proofreading, the original Chinese text is known. This makes it possible to extract the words and build a word-based language model. In this paper we compare the two models and conclude that a word-based bigram language model achieves higher conversion accuracy than a character-based bigram language model. © 2009 Springer Berlin Heidelberg.
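
As a rough illustration of the comparison described in the abstract, the sketch below decodes a Pinyin sequence with a bigram language model and Viterbi search, once over words and once over characters. It is not the authors' implementation. The toy corpus, the candidate tables, the add-one smoothing, and the function names (`train_bigram`, `log_prob`, `viterbi`) are all assumptions made for this example, and the emission probability of a Pinyin unit given a listed candidate is simply treated as 1.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count unigrams and bigrams over unit sequences (characters or words)."""
    unigrams, bigrams = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the sentence start
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def log_prob(prev, cur, unigrams, bigrams, vocab_size):
    """Add-one smoothed log P(cur | prev) (smoothing choice is an assumption)."""
    return math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))

def viterbi(pinyin_units, candidates, unigrams, bigrams):
    """Return the most probable unit sequence for a list of Pinyin units."""
    vocab_size = len(unigrams)
    # best[unit] = (log score, path) of the best path ending in that unit
    best = {"<s>": (0.0, [])}
    for py in pinyin_units:
        nxt = {}
        for cur in candidates[py]:
            score, path = max(
                (s + log_prob(prev, cur, unigrams, bigrams, vocab_size), p)
                for prev, (s, p) in best.items()
            )
            nxt[cur] = (score, path + [cur])
        best = nxt
    return max(best.values())[1]

# Toy segmented corpus (hypothetical data, for illustration only).
segmented = [["他", "是", "学生"], ["他", "是", "老师"], ["北京", "市", "很", "大"]]

# Word-based model: units are the words of the segmented text.
word_uni, word_bi = train_bigram(segmented)
word_candidates = {"ta": ["他", "她"], "shi": ["是", "市"], "xuesheng": ["学生"]}
word_result = viterbi(["ta", "shi", "xuesheng"], word_candidates, word_uni, word_bi)

# Character-based model: the same text with word boundaries dropped.
char_uni, char_bi = train_bigram(["".join(words) for words in segmented])
char_candidates = {"ta": ["他", "她"], "shi": ["是", "市"], "xue": ["学"], "sheng": ["生"]}
char_result = viterbi(["ta", "shi", "xue", "sheng"], char_candidates, char_uni, char_bi)

print(word_result)
print(char_result)
```

The same decoder serves both models; only the unit of the language model changes. On a corpus this small both variants pick the same reading, so the example shows the mechanics only and says nothing about the accuracy difference reported in the paper.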

Cite

Liu, W., & Guthrie, L. (2009). Chinese pinyin-text conversion on segmented text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5729 LNAI, pp. 116–123). https://doi.org/10.1007/978-3-642-04208-9_19
