This paper proposes a novel method that integrates multi-level linguistic knowledge for Chinese grapheme-to-phoneme (G2P) conversion. Predicting the pronunciation of non-standard words (NSWs) and disambiguating polyphonic characters are two important problems in Chinese G2P conversion. To exploit linguistic knowledge, multi-level linguistic cues, including word form, part-of-speech (POS), named entity, collocation, and syntactic structure, are extracted under a unified syntactic parsing framework and integrated by a maximum entropy approach to disambiguate polyphonic characters. In addition, text normalization is incorporated into this framework to help predict the pronunciation of non-standard words. Experimental results show that the proposed method improves performance from 95.64% to 99.23%. © 2013 Springer-Verlag Berlin Heidelberg.
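The abstract gives no implementation details, so the following is only a minimal sketch of how linguistic cues might be combined by a maximum entropy model to choose a pronunciation for a polyphonic character. Everything in it is an assumption made for illustration: the names `extract_features`, `MaxEntClassifier`, and `TOY_EXAMPLES`, the plain gradient-ascent training loop, the pinyin labels, and the handful of made-up training contexts for the character 行. It is not the authors' system, feature set, or data.

```python
import math
from collections import defaultdict


def extract_features(context):
    """Turn a dict of linguistic cues (hypothetical keys such as 'word'
    and 'pos') into string-valued indicator features."""
    return [f"{name}={value}" for name, value in sorted(context.items())]


class MaxEntClassifier:
    """Toy maximum entropy model: p(y | x) is proportional to
    exp(sum of the weights of the (feature, y) pairs active in x)."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = defaultdict(float)  # keyed by (feature, label)

    def probabilities(self, features):
        scores = {
            y: sum(self.weights[(f, y)] for f in features) for y in self.labels
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {y: math.exp(s - top) for y, s in scores.items()}
        z = sum(exps.values())
        return {y: e / z for y, e in exps.items()}

    def train(self, examples, epochs=200, learning_rate=0.5, l2=0.01):
        """Batch gradient ascent on the L2-regularized log-likelihood."""
        for _ in range(epochs):
            gradient = defaultdict(float)
            for features, gold in examples:
                probs = self.probabilities(features)
                for f in features:
                    for y in self.labels:
                        # observed feature count minus expected feature count
                        observed = 1.0 if y == gold else 0.0
                        gradient[(f, y)] += observed - probs[y]
            for key, g in gradient.items():
                step = g / len(examples) - l2 * self.weights[key]
                self.weights[key] += learning_rate * step

    def predict(self, features):
        probs = self.probabilities(features)
        return max(probs, key=probs.get)


# Made-up contexts for the polyphonic character 行 (illustrative only).
TOY_EXAMPLES = [
    (extract_features({"word": "银行", "pos": "NN"}), "hang2"),
    (extract_features({"word": "行业", "pos": "NN"}), "hang2"),
    (extract_features({"word": "行走", "pos": "VV"}), "xing2"),
    (extract_features({"word": "进行", "pos": "VV"}), "xing2"),
    (extract_features({"word": "行", "pos": "VV"}), "xing2"),
]

model = MaxEntClassifier(labels=["hang2", "xing2"])
model.train(TOY_EXAMPLES)
print(model.predict(extract_features({"word": "旅行", "pos": "VV"})))
```

In this sketch, further cues named in the abstract (named entity, collocation, syntactic structure) would simply be extra keys in the context dict, which is one reason a maximum entropy model is a convenient way to combine heterogeneous, overlapping features.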
Liu, Y., Chen, X., Gong, C., & Wu, X. (2013). Multi-level linguistic knowledge based Chinese grapheme-to-Phoneme conversion. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8261 LNCS, pp. 481–488). Springer Verlag. https://doi.org/10.1007/978-3-642-42057-3_61