Natural Scene Text Recognition Based on Encoder-Decoder Framework

50Citations
Citations of this article
36Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Aiming at the situation that complex natural scene text is difficult to recognize a scene text recognition method based on an encoder-decoder framework is proposed. The method converts the natural text recognition into a sequence mark by combining the connection time classification (CTC) and attention mechanism under the encoder-decoder framework, in order to overcome the problem of character segmentation, using the correlation between image and text sequence. First of all, a convolutional neural network (CNN) is used to generate an ordered feature sequence from the entire word image. Then, the generated feature sequence is feature-coded using the bidirectional long short-term memory (Bi-LSTM) network. Finally, an integrated module of the CTC and attention mechanism is designed to decode and output the text sequence. The experiments show that compared with the comparison method, the recognition accuracy of the method is improved obviously.

Cite

CITATION STYLE

APA

Zuo, L. Q., Sun, H. M., Mao, Q. C., Qi, R., & Jia, R. S. (2019). Natural Scene Text Recognition Based on Encoder-Decoder Framework. IEEE Access, 7, 62616–62623. https://doi.org/10.1109/ACCESS.2019.2916616

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free