This paper presents a modified class-based language model (LM) approach to Chinese unknown word identification. Unknown word identification is treated as a classification problem in which the part of speech of each unknown word serves as its class. Three types of features (contextual class features, a word juncture model, and word formation patterns) are combined within a class-based LM framework to identify unknown words on a sequence of known words. Beyond identification, the class-based LM approach also provides a solution for unknown word tagging. Experimental results show that most unknown words in Chinese texts can be resolved effectively by the proposed approach. © Springer-Verlag Berlin Heidelberg 2005.
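For orientation, the generic textbook form of a class-based bigram LM is sketched below. It is not the paper's modified model: the abstract does not give the exact way the word juncture model and word formation patterns enter the score, so those terms are omitted. The symbols $w_i$ (the $i$-th word), $c_i$ (its class, here a part-of-speech tag), and $n$ (sequence length) are notation assumed for this illustration.

```latex
% Generic class-based bigram LM (standard form; an illustrative assumption,
% not the modified model proposed in the paper).
% W = w_1 ... w_n is the word sequence, C = c_1 ... c_n the class sequence.
\hat{C} \;=\; \arg\max_{C} P(W, C)
        \;\approx\; \arg\max_{c_1 \dots c_n} \prod_{i=1}^{n}
        P(w_i \mid c_i)\, P(c_i \mid c_{i-1})
```

Under this reading, $P(c_i \mid c_{i-1})$ plays the role of a contextual class feature, and choosing the best class sequence assigns a part-of-speech tag to each word, which is why identification and tagging can share one model.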
CITATION STYLE
Fu, G., & Luke, K. K. (2005). Chinese unknown word identification using class-based LM. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3248, pp. 704–713). Springer Verlag. https://doi.org/10.1007/978-3-540-30211-7_74