Abstract
Statistical morphological inflectors are typically trained on fully supervised, type-level data. One remaining open research question is the following: How can we effectively exploit raw, token-level data to improve their performance? To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. To enable posterior inference over the latent variables, we derive an efficient variational inference procedure based on the wake-sleep algorithm. We experiment on 23 languages, using the Universal Dependencies corpora in a simulated low-resource setting, and find improvements of over 10% absolute accuracy in some cases.
Cite
Wolf-Sonkin, L., Naradowsky, J., Mielke, S. J., & Cotterell, R. (2018). A structured variational autoencoder for contextual morphological inflection. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (Vol. 1, pp. 2631–2641). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-1245