Phrase-based statistical model for Korean morpheme segmentation and POS tagging

Seung Hoon Na; Young Kil Kim

Journal Article (Open Access)

IEICE Transactions on Information and Systems (2018) E101D(2) 512-522

DOI: 10.1587/transinf.2017EDP7085

Abstract

In this paper, we propose a novel phrase-based model for Korean morphological analysis that takes the phrase as the basic processing unit, a choice that generalizes all other existing processing units. This use of phrases is motivated largely by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results on the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with a conditional random field (CRF) model, our model yields statistically significant improvements over the state-of-the-art CRF method.

Citation (APA)

Na, S. H., & Kim, Y. K. (2018). Phrase-based statistical model for Korean morpheme segmentation and POS tagging. IEICE Transactions on Information and Systems, E101D(2), 512–522. https://doi.org/10.1587/transinf.2017EDP7085
