Arabic diacritization with recurrent neural networks

Abstract

Arabic, Hebrew, and similar languages are typically written without diacritics, leading to ambiguity and posing a major challenge for core language processing tasks like speech recognition. Previous approaches to automatic diacritization employed a variety of machine learning techniques. However, they typically rely on existing tools like morphological analyzers and therefore cannot be easily extended to new genres and languages. We develop a recurrent neural network with long short-term memory layers for predicting diacritics in Arabic text. Our language-independent approach is trained solely from diacritized text without relying on external tools. We show experimentally that our model can rival state-of-the-art methods that have access to additional resources.
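
To illustrate what training "solely from diacritized text" could look like in practice, the sketch below turns a diacritized string into per-letter training pairs: the bare letters become the input sequence and the diacritics attached to each letter become its label. This is a minimal sketch under stated assumptions, not the authors' preprocessing: the per-letter labeling scheme, the function name `split_diacritics`, and the choice of the Unicode range U+064B to U+0652 as the diacritic set are all illustrative.

```python
# Assumed diacritic inventory: the Arabic combining marks U+064B..U+0652
# (tanween forms, fatha, damma, kasra, shadda, sukun).
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Split diacritized text into (letters, labels).

    letters[i] is the i-th non-diacritic character; labels[i] is the string
    of diacritics that followed it (empty if none). A sequence model could
    then be trained to predict labels[i] from the letters alone.
    """
    letters, labels = [], []
    for ch in text:
        if ch in DIACRITICS:
            if labels:
                # Attach the mark to the most recent letter; combinations
                # such as shadda + fatha are kept as one label.
                labels[-1] += ch
            # A mark with no preceding letter is dropped.
        else:
            letters.append(ch)
            labels.append("")
    return letters, labels


# Example: kaf, ta, ba, each followed by a fatha.
letters, labels = split_diacritics("\u0643\u064E\u062A\u064E\u0628\u064E")
```

Here `letters` holds the three undiacritized consonants and `labels` holds one fatha per consonant, so the same diacritized corpus supplies both the model input and the supervision without any external analyzer.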

Cite

Belinkov, Y., & Glass, J. (2015). Arabic diacritization with recurrent neural networks. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 2281–2285). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d15-1274
