The paradigm discovery problem

Abstract

This work treats the paradigm discovery problem (PDP): the task of learning an inflectional morphological system from unannotated sentences. We formalize the PDP and develop evaluation metrics for judging systems. Using currently available resources, we construct datasets for the task. We also devise a heuristic benchmark for the PDP and report empirical results on five diverse languages. Our benchmark system first uses word embeddings and string similarity to cluster forms by cell and by paradigm. We then bootstrap a neural transducer on top of the clustered data to predict the words that realize the empty paradigm slots. An error analysis of our system suggests that clustering by cell across different inflection classes is the most pressing challenge for future work. Our code and data are available at https://github.com/alexerdmann/ParadigmDiscovery.
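
To make the "cluster forms by paradigm" step concrete, here is a minimal sketch that groups surface forms using string similarity alone. It is an illustration under stated assumptions, not the authors' benchmark: it leaves out word embeddings, clustering by cell, and the neural transducer, and the function names and the 0.6 threshold are hypothetical choices made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_by_string_similarity(forms, threshold=0.6):
    """Group forms into candidate paradigms by single-link clustering.

    Two forms land in the same cluster if a chain of pairwise
    similarities >= threshold connects them. The threshold is an
    assumed value for illustration, not one taken from the paper.
    """
    # Union-find over the forms; each form starts as its own cluster.
    parent = {form: form for form in forms}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the clusters of every sufficiently similar pair.
    for a, b in combinations(forms, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each cluster and return them in sorted order.
    clusters = {}
    for form in forms:
        clusters.setdefault(find(form), set()).add(form)
    return sorted(sorted(members) for members in clusters.values())
```

On a toy list such as "walk", "walks", "walked", "jump", "jumps", "jumped", this separates the two lexemes, because forms sharing a stem score well above the threshold while forms sharing only a suffix score well below it. Pure string similarity would be expected to struggle with suppletive or heavily fused forms, which is one plausible reason a system might also draw on distributional evidence such as embeddings.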

Cite

Erdmann, A., Elsner, M., Wu, S., Cotterell, R., & Habash, N. (2020). The paradigm discovery problem. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 7778–7790). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.695
