A Deep Diacritics-Based Recognition Model for Arabic Speech: Quranic Verses as Case Study

Sarah S. Alrumiah; Amal A. Al-Shargabi

Journal Article | Open Access

IEEE Access (2023) 11 81348-81360

DOI: 10.1109/ACCESS.2023.3300972

Abstract

Arabic is spoken by more than 422 million people worldwide. Although classic Arabic is the language of the Quran, which 1.9 billion Muslims are required to recite, Arabic speech recognition remains limited. In classic Arabic, diacritics affect the pronunciation of a word, and a change in a single diacritic can change the word's meaning. However, most Arabic speech recognition models discard diacritics. This work aims to recognize classic Arabic speech while considering diacritics by converting audio signals to diacritized text using Deep Neural Network (DNN)-based models. DNN-based models recognize speech directly with a DNN and have outperformed traditional speech recognition systems, which depend on phonetics. Three models were developed to recognize Arabic speech: (i) Time Delay Neural Network (TDNN)-Connectionist Temporal Classification (CTC), (ii) Recurrent Neural Network (RNN)-CTC, and (iii) transformer. A 100-hour dataset of Quran recordings was used. The RNN-CTC model obtained state-of-the-art results, with the lowest word error rate (WER) of 19.43% and a character error rate (CER) of 3.51%. The RNN-CTC model recognizes speech character by character, which is more reliable than the transformer's whole-sentence recognition behaviour. The model performed well on clear, unstressed recordings of short sentences. Moreover, the RNN-CTC model effectively recognized sounds from outside the dataset. The findings recommend continuing efforts to enhance diacritics-based Arabic speech recognition models using clear and unstressed recordings to obtain better performance. Moreover, pretraining large speech models could yield more accurate recognition. The outcomes can be used to enhance existing classic Arabic speech recognition solutions by supporting diacritics recognition.
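
The abstract reports results as word error rate and character error rate on diacritized text. As a minimal sketch of how such metrics are commonly computed (the abstract does not state the authors' exact scoring procedure, so the Levenshtein-based definitions, the function names, and the example words below are assumptions for illustration), the following Python snippet scores a hypothesis against a reference and shows why keeping diacritics matters: two words that differ only in their diacritics count as an error, but become identical once diacritics are stripped.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over Unicode code points.

    Assumption: each diacritic is counted as its own character.
    """
    return edit_distance(reference, hypothesis) / len(reference)


def strip_diacritics(text):
    """Drop combining marks, which include the Arabic short-vowel diacritics."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# Illustrative pair: same consonants (kaf, ta, ba), different diacritics.
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # "he wrote"
KUTIBA = "\u0643\u064F\u062A\u0650\u0628\u064E"  # "it was written"

print(wer(KATABA, KUTIBA))   # 1.0: the single word is wrong
print(cer(KATABA, KUTIBA))   # 2 of 6 code points differ
print(strip_diacritics(KATABA) == strip_diacritics(KUTIBA))  # True
```

A recognizer that discards diacritics would treat the two example words as the same output, which is the ambiguity the diacritics-based models described in the abstract are meant to avoid.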

Citation (APA)

Alrumiah, S. S., & Al-Shargabi, A. A. (2023). A Deep Diacritics-Based Recognition Model for Arabic Speech: Quranic Verses as Case Study. IEEE Access, 11, 81348–81360. https://doi.org/10.1109/ACCESS.2023.3300972
