Unsupervised disambiguation of syncretism in inflected lexicons

Abstract

Lexical ambiguity makes it difficult to compute various useful statistics of a corpus. A given word form might represent any of several morphological feature bundles. One can, however, use unsupervised learning (as in EM) to fit a model that probabilistically disambiguates word forms. We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). Although this basic model does not consider a token's context, that very property allows it to operate on a simple list of unigram type counts, partitioning each count among different analyses of that unigram. We discuss evaluation metrics for this novel task and report results on 5 languages.
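To make the idea of partitioning a unigram count among analyses concrete, here is a minimal sketch of EM on a toy lexicon. It is not the authors' model: the neural prior described above is replaced by a plain categorical prior over feature bundles, and lemmas are ignored. The function name `em_disambiguate`, the bundle labels, and the counts are all hypothetical, chosen only for illustration. The sketch assumes every form has a positive count.

```python
from collections import defaultdict


def em_disambiguate(counts, analyses, iterations=100):
    """Toy EM: split each form's count among its candidate feature bundles.

    counts:   dict mapping word form -> observed unigram count (assumed > 0)
    analyses: dict mapping word form -> list of candidate feature bundles
    Assumption: a categorical prior over bundles stands in for the
    neural prior of the paper.
    """
    bundles = sorted({b for cands in analyses.values() for b in cands})
    prior = {b: 1.0 / len(bundles) for b in bundles}  # uniform start

    def posterior(form):
        # p(bundle | form) is the prior renormalized over the form's candidates
        z = sum(prior[b] for b in analyses[form])
        return {b: prior[b] / z for b in analyses[form]}

    for _ in range(iterations):
        # E-step: fractional (expected) counts per bundle
        expected = defaultdict(float)
        for form, c in counts.items():
            for b, p in posterior(form).items():
                expected[b] += c * p
        # M-step: re-estimate the prior from the expected counts
        total = sum(expected.values())
        prior = {b: expected[b] / total for b in bundles}

    # Final soft partition of each type count among its analyses
    partition = {
        form: {b: c * p for b, p in posterior(form).items()}
        for form, c in counts.items()
    }
    return prior, partition


# Hypothetical English-like data: "walked" is syncretic (past tense vs.
# past participle), while "ate" and "eaten" are unambiguous.
counts = {"walked": 100, "ate": 30, "eaten": 10}
analyses = {
    "walked": ["PST", "PST.PTCP"],
    "ate": ["PST"],
    "eaten": ["PST.PTCP"],
}
prior, partition = em_disambiguate(counts, analyses)
```

On this toy data the unambiguous forms pull the prior toward a 3:1 ratio, so the 100 tokens of "walked" end up split at roughly 75 past tense and 25 past participle. Because no token context is used, the input is only a list of type counts, which matches the setting the abstract describes.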

Cite

Cotterell, R., Kirov, C., Mielke, S. J., & Eisner, J. (2018). Unsupervised disambiguation of syncretism in inflected lexicons. In NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference (Vol. 2, pp. 548–553). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n18-2087
