Look back again: Dual parallel attention network for accurate and robust scene text recognition

Abstract

A parallel-decoupled encoder-decoder (PDED) framework has become a popular choice in scene text recognition for its flexibility and efficiency. However, the information content of queries and keys in the parallel positional attention module (PPAM) used in this framework is inconsistent (queries carry position information only; keys carry both context and position information), so visual misalignment tends to appear on hard samples (e.g., blurred text, irregular text, or low-quality images). To tackle this issue, we propose a dual parallel attention network (DPAN), in which a newly designed parallel context attention module (PCAM) is cascaded with the original PPAM, using linguistic contextual information to compensate for the information inconsistency between queries and keys. Specifically, PCAM takes the visual features from PPAM as inputs and enhances them with linguistic context from a bidirectional language model to produce queries. In this way, the information content of the queries and keys in PCAM is consistent, which helps generate more precise visual glimpses and improves the accuracy and robustness of the entire PDED framework. Experimental results verify the effectiveness of the proposed PCAM, demonstrating the necessity of keeping the information content of queries and keys consistent in the attention mechanism. On six benchmarks covering both regular and irregular text, DPAN surpasses existing leading methods by large margins, achieving new state-of-the-art performance. The code is available at https://github.com/Jackandrome/DPAN.
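The two-stage idea in the abstract can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the dimensions are arbitrary, and a simple forward/backward cumulative average stands in for the paper's bidirectional language model. It shows the key point: the first attention pass uses position-only queries (the PPAM setting), while the second pass reuses the resulting glimpses, enriched with context, as queries (the PCAM setting), so queries and keys carry comparable information.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention: glimpses = softmax(Q K^T / sqrt(d)) V
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores) @ values

# Toy dimensions (illustrative only, not taken from the paper)
T, N, d = 5, 20, 8   # T: max text length, N: visual positions, d: feature dim
rng = np.random.default_rng(0)

visual_feats = rng.normal(size=(N, d))   # encoder output: context + position info
pos_queries  = rng.normal(size=(T, d))   # PPAM queries: position info only

# Stage 1 (PPAM-style): position-only queries attend to visual features,
# so queries and keys carry inconsistent information content.
ppam_glimpses = attention(pos_queries, visual_feats, visual_feats)   # (T, d)

# Stage 2 (PCAM-style): enhance the glimpses with bidirectional context to
# form new queries. A cumulative forward/backward average is a stand-in for
# the paper's bidirectional language model.
fwd = np.cumsum(ppam_glimpses, axis=0) / np.arange(1, T + 1)[:, None]
bwd = np.cumsum(ppam_glimpses[::-1], axis=0)[::-1] / np.arange(T, 0, -1)[:, None]
ctx_queries = ppam_glimpses + 0.5 * (fwd + bwd)

# Context-enriched queries now match the keys' information content,
# yielding refined glimpses.
pcam_glimpses = attention(ctx_queries, visual_feats, visual_feats)   # (T, d)
print(pcam_glimpses.shape)  # (5, 8)
```

In the actual DPAN, the context enhancement is learned (a bidirectional language model) and the whole pipeline is trained end to end; the sketch only mirrors the data flow of cascading PCAM after PPAM.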

Cite

APA

Fu, Z., Xie, H., Jin, G., & Guo, J. (2021). Look back again: Dual parallel attention network for accurate and robust scene text recognition. In ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 638–644). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460426.3463674
