Arabic, Hebrew, and similar languages are typically written without diacritics, leading to ambiguity and posing a major challenge for core language processing tasks like speech recognition. Previous approaches to automatic diacritization employed a variety of machine learning techniques. However, they typically rely on existing tools like morphological analyzers and therefore cannot be easily extended to new genres and languages. We develop a recurrent neural network with long shortterm memory layers for predicting diacritics in Arabic text. Our language-independent approach is trained solely from diacritized text without relying on external tools. We show experimentally that our model can rival state-of-the-art methods that have access to additional resources.
CITATION STYLE
Belinkov, Y., & Glass, J. (2015). Arabic diacritization with recurrent neural networks. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 2281–2285). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d15-1274
Mendeley helps you to discover research relevant for your work.