Vietnamese part of speech tagging based on multi-category words disambiguation model

Zhao Chen; Liu Yanchao; Guo Jianyi; Chen Wei; Yan Xin; Yu Zhengtao; Chen Xiuqin

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10619 LNAI 267-277

DOI: 10.1007/978-3-319-73618-1_23

Abstract

POS tagging is a fundamental task in natural language processing that determines the quality of subsequent processing, and the ambiguity of multi-category words directly affects the accuracy of Vietnamese POS tagging. POS tagging for English and Chinese has already achieved good results, but the accuracy of Vietnamese POS tagging still needs improvement. To address this problem, this paper proposes a novel method for Vietnamese POS tagging based on a multi-category word disambiguation model and a part-of-speech dictionary. A multi-category word dictionary and a non-multi-category word dictionary are generated from a Vietnamese dictionary and used to build a POS tagging corpus. From this corpus, 396,946 multi-category words are extracted, and a maximum entropy disambiguation model for Vietnamese part of speech is constructed using statistical methods; based on this model, both multi-category and non-multi-category words are tagged. Experimental results show that the proposed method achieves higher accuracy than the existing model, which demonstrates that it is feasible and effective.
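The two-stage idea in the abstract (dictionary lookup for words with a single category, a maximum entropy model for multi-category words) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy dictionary `POS_DICT`, the toy corpus `CORPUS`, the tag labels, the context features (current, previous, and next word), and the plain stochastic gradient training loop. The abstract does not specify the paper's actual feature templates, tag set, or training procedure.

```python
import math
from collections import defaultdict

# Toy dictionary (illustrative, not from the paper): word -> allowed tags.
POS_DICT = {
    "tôi": {"P"}, "bóng": {"N"}, "hòn": {"Nc"}, "cứng": {"A"},
    "đá": {"N", "V"},  # multi-category word: more than one possible tag
}

# Toy tagged corpus (illustrative, not from the paper).
CORPUS = [
    [("tôi", "P"), ("đá", "V"), ("bóng", "N")],
    [("hòn", "Nc"), ("đá", "N"), ("cứng", "A")],
]

def is_multi_category(word):
    """A word is multi-category if the dictionary lists several tags for it."""
    return len(POS_DICT.get(word, ())) > 1

def features(words, i):
    """Hypothetical context features: the word and its two neighbours."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"w={words[i]}", f"prev={prev_w}", f"next={next_w}"]

def tag_probs(weights, feats, candidates):
    """Maximum entropy (softmax) distribution over the candidate tags."""
    scores = {t: sum(weights[(f, t)] for f in feats) for t in candidates}
    top = max(scores.values())
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def train(corpus, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood,
    using only the multi-category tokens as training examples."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in corpus:
            words = [w for w, _ in sent]
            for i, (w, gold) in enumerate(sent):
                if not is_multi_category(w):
                    continue
                feats = features(words, i)
                probs = tag_probs(weights, feats, sorted(POS_DICT[w]))
                for t, p in probs.items():
                    grad = (1.0 if t == gold else 0.0) - p
                    for f in feats:
                        weights[(f, t)] += lr * grad
    return weights

def tag(words, weights):
    """Dictionary lookup for single-category words, model for the rest."""
    out = []
    for i, w in enumerate(words):
        cands = sorted(POS_DICT.get(w, {"UNK"}))  # "UNK" is a placeholder tag
        if len(cands) == 1:
            out.append((w, cands[0]))
        else:
            probs = tag_probs(weights, features(words, i), cands)
            out.append((w, max(probs, key=probs.get)))
    return out

WEIGHTS = train(CORPUS)
```

Calling `tag(["tôi", "đá", "bóng"], WEIGHTS)` resolves the ambiguous word from its neighbours, while the other words are tagged directly from the dictionary. Restricting the model's candidate tags to those the dictionary allows is one plausible reading of how the dictionary and the disambiguation model could be combined.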

Citation (APA)

Chen, Z., Yanchao, L., Jianyi, G., Wei, C., Xin, Y., Zhengtao, Y., & Xiuqin, C. (2018). Vietnamese part of speech tagging based on multi-category words disambiguation model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10619 LNAI, pp. 267–277). Springer Verlag. https://doi.org/10.1007/978-3-319-73618-1_23
