Word segmentation for Burmese (Myanmar)

Chenchen Ding; Ye Kyaw Thu; Masao Utiyama; Eiichiro Sumita

Journal Article (Open Access)

ACM Transactions on Asian and Low-Resource Language Information Processing (2016) 15(4)

DOI: 10.1145/2846095

Abstract

This note reports and discusses experiments on various word segmentation approaches for the Burmese language. Specifically, dictionary-based, statistical, and machine learning approaches are tested. Experimental results demonstrate that statistical and machine learning approaches perform significantly better than dictionary-based approaches. We believe that this note, based on an annotated corpus of considerable size (approximately half a million words), is the first systematic comparison of word segmentation approaches for Burmese. This work aims to identify the properties of Burmese text and suitable approaches to processing it, and to promote further research on this understudied language.

Cite

Ding, C., Thu, Y. K., Utiyama, M., & Sumita, E. (2016). Word segmentation for Burmese (Myanmar). ACM Transactions on Asian and Low-Resource Language Information Processing, 15(4). https://doi.org/10.1145/2846095
