Finding Sami Cognates with a Character-Based NMT Approach

Mika Hämäläinen; Jack Reuter

Proceedings of the Workshop on Computational Methods for Endangered Languages (2019)

DOI: 10.33011/computel.v1i.395

Abstract

We approach the problem of expanding the set of cognate relations with a sequence-to-sequence NMT model. The language pair of interest, Skolt Sami and North Sami, has too little parallel data to train an NMT model directly. We address this, on the one hand, by training the model on cognates between North Sami and other Uralic languages and, on the other, by generating additional synthetic training data with an SMT model. The cognates found using our method are made publicly available in the Online Dictionary of Uralic Languages.
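As a rough illustration of the data preparation such a character-based setup implies, the sketch below splits words into space-separated characters and merges a small gold set with synthetic pairs. It is not the authors' pipeline: the function names (`to_char_tokens`, `merge_training_data`, `to_parallel_lines`) are hypothetical, and the word pairs are artificial placeholder strings, not real cognates.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (source word, target word)


def to_char_tokens(word: str) -> str:
    """Split a word into space-separated characters (one token per code point)."""
    return " ".join(word.strip())


def merge_training_data(gold: Iterable[Pair], synthetic: Iterable[Pair]) -> List[Pair]:
    """Concatenate gold and synthetic pairs, dropping exact duplicates.

    Gold pairs come first, so their order is preserved.
    """
    seen = set()
    merged: List[Pair] = []
    for pair in list(gold) + list(synthetic):
        if pair not in seen:
            seen.add(pair)
            merged.append(pair)
    return merged


def to_parallel_lines(pairs: Iterable[Pair]) -> Tuple[List[str], List[str]]:
    """Produce aligned source and target lines for a sequence-to-sequence trainer."""
    pairs = list(pairs)
    src = [to_char_tokens(s) for s, _ in pairs]
    tgt = [to_char_tokens(t) for _, t in pairs]
    return src, tgt


# Placeholder strings only (assumption): stand-ins for attested and SMT-generated pairs.
gold = [("abc", "abd")]
synthetic = [("abc", "abd"), ("xyz", "xyw")]

pairs = merge_training_data(gold, synthetic)
src_lines, tgt_lines = to_parallel_lines(pairs)
```

Splitting on code points treats each character as a token, so the model can learn character-level correspondences between related word forms; text with combining diacritics may need Unicode normalization first.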

Cite

Hämäläinen, M., & Reuter, J. (2019). Finding Sami Cognates with a Character-Based NMT Approach. Proceedings of the Workshop on Computational Methods for Endangered Languages. https://doi.org/10.33011/computel.v1i.395
