This paper proposes a two-model method for automatic part-of-speech (POS) guessing of Chinese unknown words. The first model uses machine learning to predict the POS of an unknown word from its internal component features, and the credibility of each prediction is then measured. For low-credibility words, the second model revises the first model's result using the global context information of those words. In experiments, the first model achieves 93.40% precision for all words and 86.60% for disyllabic words, a significant improvement over the best results reported in previous studies (89% for all words and 74% for disyllabic words). The second model improves these results by a further 0.80% precision for all words and 1.30% for disyllabic words. © 2008. Licensed under the Creative Commons.
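The two-stage control flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the learner, the credibility measure, or how global context is used, so the function names (`guess_pos`, `toy_first_model`, `toy_second_model`), the 0.8 threshold, the tag set, and the toy lookup tables are all hypothetical and are not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical type: a first-stage model returns a POS -> probability map.
Distribution = Dict[str, float]


def guess_pos(
    word: str,
    contexts: List[List[str]],
    first_model: Callable[[str], Distribution],
    second_model: Callable[[str, List[List[str]], Distribution], str],
    threshold: float = 0.8,  # assumed credibility cutoff, not from the paper
) -> str:
    """Keep the first model's tag when it is credible, else defer to the second."""
    dist = first_model(word)
    tag, confidence = max(dist.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        return tag
    return second_model(word, contexts, dist)


def toy_first_model(word: str) -> Distribution:
    # Toy stand-in: uses only the final character as an internal component feature.
    table = {
        "者": {"NN": 0.95, "VV": 0.05},
        "化": {"VV": 0.55, "NN": 0.45},
    }
    return table.get(word[-1], {"NN": 0.5, "VV": 0.5})


def toy_second_model(
    word: str, contexts: List[List[str]], dist: Distribution
) -> str:
    # Toy stand-in: every occurrence of the word across all context sentences
    # casts one vote, decided by the token immediately before it.
    votes: Counter = Counter()
    for tokens in contexts:
        for i, tok in enumerate(tokens):
            if tok == word and i > 0:
                votes["VV" if tokens[i - 1] in {"不", "将"} else "NN"] += 1
    if not votes:
        # No usable context: fall back to the first model's best guess.
        return max(dist, key=dist.get)
    return votes.most_common(1)[0][0]


credible = guess_pos("消费者", [], toy_first_model, toy_second_model)
revised = guess_pos(
    "数字化",
    [["的", "数字化"], ["该", "数字化"]],
    toy_first_model,
    toy_second_model,
)
```

In this toy run, the first word is tagged directly by the first model because its top probability clears the threshold, while the second word falls below it and is re-tagged from its contexts. The lookup tables are placeholders for trained models and make no linguistic claims.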
Qiu, L., Hu, C., & Zhao, K. (2008). A method for automatic POS guessing of Chinese unknown words. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 705–712). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1599081.1599170