OTEANN: Estimating the Transparency of Orthographies with an Artificial Neural Network

Abstract

To transcribe spoken language into a written medium, most alphabets enable an unambiguous sound-to-letter rule. However, some writing systems have moved away from this simple concept, and little work exists in Natural Language Processing (NLP) on measuring that distance. In this study, we use an Artificial Neural Network (ANN) model to evaluate the transparency between written words and their pronunciation, hence the name Orthographic Transparency Estimation with an ANN (OTEANN). Using datasets derived from Wikimedia dictionaries, we trained and tested this model and scored it by the percentage of correct predictions in phoneme-to-grapheme and grapheme-to-phoneme translation tasks. The scores obtained on 17 orthographies were in line with the estimates of other studies. Interestingly, the model also provided insight into typical mistakes made by learners who consider only the phonemic rule in reading and writing.
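
The scoring described in the abstract, a percentage of correct predictions, can be illustrated with a minimal sketch. It assumes that a prediction counts as correct only when the whole predicted string matches the reference; the abstract does not spell out this criterion. The function name `percent_correct` and the toy word lists are hypothetical and are not taken from the paper or its datasets.

```python
def percent_correct(predictions, references):
    """Return the percentage of predictions that exactly match their reference.

    Assumption: whole-string exact match is the correctness criterion.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical toy data (not from the paper): spellings a purely phonemic
# speller might produce, compared with the actual English spellings.
toy_references = ["cat", "knight", "though", "ship"]
toy_predictions = ["cat", "nite", "tho", "ship"]

score = percent_correct(toy_predictions, toy_references)
print(f"{score:.1f}% correct")
```

Under this assumption, a higher percentage on a given orthography would indicate a more transparent mapping between spelling and pronunciation for that task direction.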

Cite

Marjou, X. (2021). OTEANN: Estimating the Transparency of Orthographies with an Artificial Neural Network. In SIGTYP 2021 - 3rd Workshop on Research in Computational Typology and Multilingual NLP, Proceedings of the Workshop (pp. 1–9). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.sigtyp-1.1
