Part-of-speech tagging using word probability based on category patterns

Mi Young Kang; Sung Won Jung; Kyung Soon Park; Hyuk Chul Kwon

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 4394 LNCS, pp. 119-130 (2007)

DOI: 10.1007/978-3-540-70939-8_11

Abstract

This paper focuses on part-of-speech (POS, category) tagging based on word probability estimated from morpheme unigrams and the category patterns within a word. The word-N-gram-based POS-tagging model is difficult to adapt to agglutinative languages such as Korean, Turkish and Hungarian because of the high productivity of words in these languages. For this reason, many stochastic studies on Korean POS tagging have been based on morpheme N-grams. However, the morpheme-N-gram model also struggles with data sparseness when contextual information is added to ensure sufficient performance. In addition, the model has difficulty capturing the relationships among morphemes within a word. The present POS-tagging algorithm (a) resolves the data-sparseness problem through a morpheme-unigram-based approach and (b) accounts for the relationships among morphemes within a word by estimating the weight of each morpheme's category in the category pattern that constitutes the word. The proposed model achieved performance similar to that of other models that use more than just the morpheme-unigram model. © Springer-Verlag Berlin Heidelberg 2007.
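The abstract does not give the scoring formula, so the following Python sketch is only one plausible reading of the idea, not the authors' method. It assumes that a candidate analysis of a word is a sequence of (morpheme, category) pairs, that each morpheme contributes a unigram probability, and that each category is weighted by its position in the word's category pattern. The multiplicative combination, the table names (MORPHEME_UNIGRAM, PATTERN_WEIGHT), the romanized morphemes and every number are hypothetical and chosen purely for illustration.

```python
from math import prod

# Hypothetical morpheme-unigram probabilities, keyed by (morpheme, category).
# These values are invented for illustration and do not come from the paper.
MORPHEME_UNIGRAM = {
    ("na", "PRONOUN"): 0.03,
    ("neun", "PARTICLE"): 0.06,
    ("nal", "VERB"): 0.002,
    ("neun", "ENDING"): 0.04,
}

# Hypothetical weights of each category within a category pattern.
# A pattern is the tuple of categories that make up one word.
PATTERN_WEIGHT = {
    ("PRONOUN", "PARTICLE"): (0.5, 0.5),
    ("VERB", "ENDING"): (0.6, 0.4),
}


def word_score(analysis):
    """Score one candidate analysis of a word.

    Assumed formula: the product, over morphemes, of the unigram
    probability times the weight of that morpheme's category in the
    word's category pattern. Unseen patterns or morphemes score 0.0.
    """
    pattern = tuple(category for _, category in analysis)
    weights = PATTERN_WEIGHT.get(pattern)
    if weights is None:
        return 0.0
    return prod(
        MORPHEME_UNIGRAM.get(pair, 0.0) * weight
        for pair, weight in zip(analysis, weights)
    )


def best_analysis(candidates):
    """Return the highest-scoring candidate analysis of a word."""
    return max(candidates, key=word_score)


# Two illustrative candidate analyses of the same surface word.
PRONOUN_READING = [("na", "PRONOUN"), ("neun", "PARTICLE")]
VERB_READING = [("nal", "VERB"), ("neun", "ENDING")]

if __name__ == "__main__":
    chosen = best_analysis([PRONOUN_READING, VERB_READING])
    print(chosen, word_score(chosen))
```

Because the score depends only on morpheme unigrams and within-word pattern weights, no counts over neighbouring words or morpheme N-grams are needed, which is the property the abstract credits with avoiding data sparseness. A real tagger would also need smoothing for unseen morphemes and patterns, which this sketch replaces with a zero score.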

Cite

Kang, M. Y., Jung, S. W., Park, K. S., & Kwon, H. C. (2007). Part-of-speech tagging using word probability based on category patterns. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4394 LNCS, pp. 119–130). Springer Verlag. https://doi.org/10.1007/978-3-540-70939-8_11
