Representation of Yine (Arawak) Morphology by Finite State Transducer Formalism

Abstract

We represent the complexity of Yine (Arawak) morphology with a morphological analyzer based on finite-state transducers (FSTs). Yine is a low-resource, polysynthetic indigenous language of Peru, spoken by approximately 3,000 people and classified as ‘definitely endangered’ by UNESCO. We review Yine morphology, focusing on morphophonology, possessive constructions, and verbal predicates. We then develop FSTs to model these components, proposing techniques for challenging problems such as the complex patterns of incorporating open- and closed-category arguments. This is a work in progress, and further development and verification of the analyzer remain to be done. The analyzer will serve both as a tool for better documenting the Yine language and as a component of natural language processing (NLP) applications such as spell checking and correction.
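
For readers unfamiliar with the formalism, the general idea of FST-based morphological analysis can be sketched in a few lines of Python. This is only a toy illustration, not the authors' analyzer or toolchain: the states, the tags, and every morpheme string below (`ka`, `pa`, `tomi`, `rusa`, `li`) are invented placeholders and are not Yine forms. Each arc consumes a piece of the surface word and emits a piece of the analysis, and an empty string plays the role of an epsilon arc.

```python
from typing import Dict, List, Tuple

# An arc is (surface chunk consumed, analysis chunk emitted, next state).
Arc = Tuple[str, str, str]

# Hypothetical toy lexicon: these morphemes are made up for illustration
# and are NOT Yine data or taken from the paper.
ARCS: Dict[str, List[Arc]] = {
    "start": [
        ("ka", "1SG.POSS+", "root"),
        ("pa", "2SG.POSS+", "root"),
        ("", "", "root"),        # epsilon arc: unpossessed noun
    ],
    "root": [
        ("tomi", "house", "suffix"),
        ("rusa", "dog", "suffix"),
    ],
    "suffix": [
        ("li", "+PL", "end"),
        ("", "", "end"),         # epsilon arc: no suffix
    ],
}
FINAL = {"end"}


def analyze(surface: str, state: str = "start") -> List[str]:
    """Return every analysis whose path consumes the whole surface string."""
    results: List[str] = []
    if state in FINAL and surface == "":
        results.append("")
    for consumed, emitted, next_state in ARCS.get(state, []):
        if surface.startswith(consumed):
            for rest in analyze(surface[len(consumed):], next_state):
                results.append(emitted + rest)
    return results


if __name__ == "__main__":
    for word in ["katomili", "rusa", "xyz"]:
        print(word, analyze(word))
```

A word that no path accepts yields an empty list, which is the property a spell checker built on such an analyzer would rely on. Real analyzers additionally compose morphophonological rules with the lexicon, which this sketch leaves out.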

Cite

Ingunza, A. M., Miller, J. E., Oncevay, A., & Zariquiey, R. (2021). Representation of Yine (Arawak) Morphology by Finite State Transducer Formalism. In Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 (pp. 102–112). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.americasnlp-1.11
