Abstract
We present a model of morphological segmentation that jointly learns to segment and restore orthographic changes, e.g., funniest ⟼ fun-y-est. We term this form of analysis canonical segmentation and contrast it with the traditional surface segmentation, which segments a surface form into a sequence of substrings, e.g., funniest ⟼ funn-i-est. We derive an importance sampling algorithm for approximate inference in the model and report experimental results on English, German and Indonesian.
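The difference between the two kinds of analysis can be sketched in a few lines. This is only an illustration of the definitions in the abstract, not the paper's model: the helper name `is_surface_segmentation` is hypothetical, and the only data used are the two example analyses of funniest given above. A surface segmentation concatenates back to the original word, while a canonical segmentation generally does not, because orthographic changes have been undone.

```python
from typing import List


def is_surface_segmentation(word: str, segments: List[str]) -> bool:
    """Return True if the segments concatenate back to the surface form.

    Hypothetical helper for illustration only; it checks the defining
    property of a surface segmentation as described in the abstract.
    """
    return "".join(segments) == word


word = "funniest"
surface = ["funn", "i", "est"]   # surface segmentation from the abstract
canonical = ["fun", "y", "est"]  # canonical segmentation from the abstract

print(is_surface_segmentation(word, surface))    # True
print(is_surface_segmentation(word, canonical))  # False
```

The canonical analysis fails the check because restoring the orthographic changes yields segments that no longer spell out the surface string.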
Cite
Cotterell, R., Vieira, T., & Schütze, H. (2016). A joint model of orthography and morphological segmentation. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference (pp. 664–669). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-1080