Hidden Markov model-based Korean part-of-speech tagging considering high agglutinativity, word-spacing, and lexical correlativity

Sang Zoo Lee; Jun Ichi Tsujii; Hae Chang Rim

Conference Proceedings (Open Access)


Proceedings of the Annual Meeting of the Association for Computational Linguistics (2000) 2000-October

DOI: 10.3115/1075218.1075266


Abstract

In this paper we present hidden Markov models for Korean part-of-speech tagging that take into account characteristics of Korean such as high agglutinativity, word-spacing, and high lexical correlativity. To exploit the rich information available in context, the models adopt a less strict Markov assumption. Because such models have a large number of parameters, the sparse-data problem is severe and the parameters tend to be estimated unreliably. To overcome the sparse-data problem, our models use a simplified version of the well-known back-off smoothing method. To mitigate the unreliable-estimation problem, our models assume joint independence instead of conditional independence, because joint probabilities have the same degree of estimation reliability. Experimental results show that models with rich contexts perform better than standard HMMs and that the joint independence assumption is effective in some models.
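The abstract does not spell out the simplified back-off scheme, so the following is only a minimal sketch of the general idea, under stated assumptions: a tag-transition estimate uses the bigram relative frequency when the bigram was observed and otherwise falls back to the unigram relative frequency, with no discounting. The function names (`train_counts`, `backoff_transition`), the `<s>` start symbol, and the toy tags are hypothetical and are not taken from the paper; because nothing is discounted, the returned values are scores rather than a properly normalized distribution.

```python
from collections import Counter


def train_counts(tagged_sentences):
    """Collect tag, context, and tag-bigram counts from (word, tag) sentences."""
    tag_counts, context_counts, bigram_counts = Counter(), Counter(), Counter()
    for sentence in tagged_sentences:
        # "<s>" is an assumed sentence-start symbol, not part of the tagset.
        tags = ["<s>"] + [tag for _, tag in sentence]
        tag_counts.update(tags[1:])          # tags as predicted items
        context_counts.update(tags[:-1])     # tags as conditioning contexts
        bigram_counts.update(zip(tags, tags[1:]))
    return tag_counts, context_counts, bigram_counts


def backoff_transition(prev_tag, tag, tag_counts, context_counts, bigram_counts):
    """Score for `tag` following `prev_tag` with a discount-free back-off.

    Assumption: seen bigrams use their relative frequency; unseen bigrams
    back off to the unigram relative frequency of `tag`.
    """
    seen = bigram_counts[(prev_tag, tag)]
    if seen > 0:
        return seen / context_counts[prev_tag]
    total = sum(tag_counts.values())
    return tag_counts[tag] / total if total else 0.0


# Toy corpus with made-up words and tags, for illustration only.
corpus = [
    [("w1", "N"), ("w2", "V")],
    [("w3", "N"), ("w4", "N")],
]
tag_counts, context_counts, bigram_counts = train_counts(corpus)
seen_score = backoff_transition("N", "V", tag_counts, context_counts, bigram_counts)
unseen_score = backoff_transition("V", "N", tag_counts, context_counts, bigram_counts)
```

Here `seen_score` comes from the observed bigram count, while `unseen_score` comes from the unigram fallback because the pair ("V", "N") never occurs in the toy corpus. A model with richer contexts, as the abstract describes, would chain more back-off levels in the same way.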

Cite (APA)

Lee, S. Z., Tsujii, J. I., & Rim, H. C. (2000). Hidden Markov model-based Korean part-of-speech tagging considering high agglutinativity, word-spacing, and lexical correlativity. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2000-October). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1075218.1075266
