An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition

Abstract

Text recognition is critical in various domains, including driving assistance, handwriting recognition, and aiding the visually impaired. In recent years, deep learning-based methods have demonstrated outstanding performance in Scene Text Recognition (STR). However, STR remains challenging, and the scarcity of non-Latin language datasets compounds the difficulty. To address this, we collected a dataset of 20,000 Persian digit images covering a range of challenges, making it suitable for text recognition tasks. Furthermore, for Persian digit recognition we propose a convolutional model that incorporates a squeeze-and-excitation gate, which pushes the model to focus on latent features, and connectionist temporal classification (CTC), which enables end-to-end sequence learning. We compare the proposed model extensively against different architectures and models. Our approach achieves an accuracy of 94.26% on our dataset, outperforming the other methods and demonstrating its effectiveness in Persian digit recognition.
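
The abstract credits connectionist temporal classification with enabling end-to-end sequence learning. As a rough illustration of how a CTC output is turned into a digit string, the sketch below implements best-path (greedy) decoding in plain Python. It is not the authors' implementation: the function name `ctc_greedy_decode`, the constant `PERSIAN_DIGITS`, and the choice of index 0 for the blank symbol are assumptions made for this example only.

```python
from typing import List, Sequence

# Assumed label layout for this sketch: index 0 is the CTC blank,
# and indices 1..10 map to the ten Persian digits in order.
PERSIAN_DIGITS = "۰۱۲۳۴۵۶۷۸۹"
BLANK = 0


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: str = PERSIAN_DIGITS,
    blank: int = BLANK,
) -> str:
    """Best-path CTC decoding: take the top label per frame,
    merge consecutive repeats, then drop blanks."""
    decoded: List[str] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest score in this frame (first one wins on ties).
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the label is a non-blank that differs from the previous frame.
        if label != blank and label != previous:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)
```

A blank between two identical labels is what lets the decoder emit a doubled digit: the frame labels 1, 1, 0, 1 decode to two copies of the first digit, whereas 1, 1, 1 decode to one.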

Cite

Alshawi, A. A. A., Tanha, J., & Balafar, M. A. (2024). An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition. IEEE Access, 12, 8123–8134. https://doi.org/10.1109/ACCESS.2024.3352748
