One of the most significant problems in part-of-speech (POS) tagging of Chinese text is identifying the words in a sentence, since Chinese has no whitespace to delimit them. Because it is impossible to register every word in a dictionary in advance, unknown words inevitably occur during segmentation and tagging. The unknown word problem therefore has a considerable effect on the accuracy of the speech produced by a Chinese text-to-speech (TTS) system. In this paper, we present a support vector machine (SVM) based method that predicts unknown words from the output of word segmentation and tagging. To achieve the processing speed a TTS system requires, we detect candidate boundaries of unknown words before starting the actual prediction. Unknown word prediction is thus performed in two phases: detection, then prediction. Experimental results are promising, showing high precision and high recall at high speed. © Springer-Verlag Berlin Heidelberg 2005.
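The two-phase idea in the abstract (cheap detection first, classifier-based prediction only on the flagged spans) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' method: the detection heuristic (runs of consecutive single-character tokens in the segmenter output), the function names `detect_candidates` and `predict_unknown_words`, and the `is_word` callable, which merely stands in for a trained SVM decision function.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) indices into the token list


def detect_candidates(tokens: List[str], min_run: int = 2) -> List[Span]:
    """Phase 1 (assumed heuristic): flag runs of single-character tokens.

    A segmenter that misses a word often emits it as isolated characters,
    so such runs are treated here as candidate unknown-word spans.
    """
    spans: List[Span] = []
    start = None
    # The empty-string sentinel closes a run that reaches the end of the list.
    for i, tok in enumerate(tokens + [""]):
        if len(tok) == 1:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                spans.append((start, i))
            start = None
    return spans


def predict_unknown_words(
    tokens: List[str],
    is_word: Callable[[str], bool],  # hypothetical stand-in for an SVM classifier
    min_run: int = 2,
) -> List[str]:
    """Phase 2: call the classifier only on detected spans and merge accepted ones."""
    out: List[str] = []
    pos = 0
    for start, end in detect_candidates(tokens, min_run):
        out.extend(tokens[pos:start])
        candidate = "".join(tokens[start:end])
        if is_word(candidate):
            out.append(candidate)            # accept: merge into one word
        else:
            out.extend(tokens[start:end])    # reject: keep the original tokens
        pos = end
    out.extend(tokens[pos:])
    return out


# Toy usage: the lambda plays the role of the trained classifier.
segmented = ["我", "喜欢", "奥", "巴", "马"]
merged = predict_unknown_words(segmented, lambda s: s == "奥巴马")
```

The point of the split is that the expensive classifier is invoked only on the few spans the detector flags, rather than on every possible token boundary, which is where the speed benefit for a TTS front end would come from under these assumptions.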
Ha, J., Zheng, Y., Kim, B., Lee, G. G., & Seong, Y. S. (2005). High speed unknown word prediction using support vector machine for Chinese text-to-speech systems. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3248, pp. 509–517). Springer Verlag. https://doi.org/10.1007/978-3-540-30211-7_54