With the increasing interest in low-resource languages, unsupervised morphological segmentation has become an active area of research, where approaches based on Adaptor Grammars achieve state-of-the-art results. We demonstrate the power of harnessing linguistic knowledge as priors within Adaptor Grammars in a minimally-supervised learning fashion. We introduce two types of priors: 1) grammar definition, where we design language-specific grammars; and 2) linguist-provided affixes, collected by an expert in the language and seeded into the grammars. We use Japanese and Georgian as respective case studies for the two types of priors and introduce new datasets for these languages, with gold morphological segmentation for evaluation. We show that the use of priors results in error reductions of 8.9% and 34.2%, respectively, over the equivalent state-of-the-art unsupervised system.
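To make the second type of prior concrete, the sketch below illustrates one way linguist-provided affixes could be seeded into an Adaptor Grammar specification as additional rewrite rules. This is a minimal illustration, not the paper's actual setup: the rule format loosely follows the py-cfg style commonly used for Adaptor Grammar morphology work, and the nonterminal names, grammar design, and example suffixes are assumptions.

```python
# Illustrative sketch: seeding linguist-provided suffixes into a simple
# prefix-stem-suffix grammar as extra character-spelled rewrite rules.
# The grammar layout and rule syntax are assumptions for illustration;
# terminal rules for individual characters are omitted for brevity.

BASE_GRAMMAR = """\
Word --> Prefix Stem Suffix
Prefix --> Chars
Stem --> Chars
Suffix --> Chars
Chars --> Char
Chars --> Char Chars
"""

def seed_affixes(grammar: str, suffixes: list[str]) -> str:
    """Append one rule per provided suffix, spelled out character by
    character, so segmentations using these affixes are available to
    the sampler from the start."""
    rules = [f"Suffix --> {' '.join(suffix)}" for suffix in suffixes]
    return grammar + "\n".join(rules) + "\n"

# Hypothetical suffixes, purely for illustration.
print(seed_affixes(BASE_GRAMMAR, ["ebi", "shi", "it"]))
```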
Citation
Eskander, R., Lowry, C., Khandagale, S., Callejas, F., Klavans, J., Polinsky, M., & Muresan, S. (2021). Minimally-Supervised Morphological Segmentation using Adaptor Grammars with Linguistic Priors. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 3969–3974). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.347