Detecting arbitrary oriented text in the wild with a visual attention model


Abstract

Text embedded in images provides important semantic information about a scene and its content. Detecting text in an unconstrained environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. We present a novel attention model for detecting arbitrarily oriented and curved scene text. Inspired by the attention mechanisms in the human visual system, our model uses a spatial glimpse network to process the attended area and deploys a recurrent neural network that aggregates information over time to determine the attention movement. Combined with an off-the-shelf region proposal method, the model achieves state-of-the-art performance on the widely used ICDAR2013 dataset and on the MSRA-TD500 dataset, which contains arbitrarily oriented text.
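The abstract describes a recurrent attention loop: crop a glimpse around the current fixation, encode it, fold it into a recurrent hidden state, and emit the next attention location. The sketch below is a minimal NumPy illustration of that loop only, not the paper's implementation: the weight shapes, glimpse size, and random initialization are all assumptions made for the example, and the paper's CNN glimpse encoder is replaced by a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_glimpse(image, center, size=8):
    """Crop a size x size patch around `center`, clipped to the image bounds."""
    h, w = image.shape
    cy, cx = center
    y0 = int(np.clip(cy - size // 2, 0, h - size))
    x0 = int(np.clip(cx - size // 2, 0, w - size))
    return image[y0:y0 + size, x0:x0 + size]

class GlimpseRNN:
    """Toy recurrent attention loop: encode each glimpse, update a hidden
    state, and predict the next attention location from that state.
    Weights are random stand-ins; a trained model would learn them."""
    def __init__(self, glimpse_size=8, hidden=32):
        g_dim = glimpse_size * glimpse_size
        self.glimpse_size = glimpse_size
        self.Wg = rng.normal(0, 0.1, (hidden, g_dim))   # glimpse encoder (linear stand-in for a CNN)
        self.Wh = rng.normal(0, 0.1, (hidden, hidden))  # recurrent weights
        self.Wl = rng.normal(0, 0.1, (2, hidden))       # location head
        self.h = np.zeros(hidden)

    def step(self, image, center):
        g = extract_glimpse(image, center, self.glimpse_size).ravel()
        self.h = np.tanh(self.Wg @ g + self.Wh @ self.h)
        # squash the location head's output from (-1, 1) into image coordinates
        loc = (np.tanh(self.Wl @ self.h) + 1) / 2
        return (loc * np.array(image.shape)).astype(int)

image = rng.random((64, 64))          # placeholder grayscale "scene"
model = GlimpseRNN()
center = np.array([32, 32])           # start fixation at the image center
trajectory = [center]
for _ in range(5):
    center = model.step(image, center)
    trajectory.append(center)
# trajectory now holds 6 fixations, each inside the 64x64 image
```

In the paper, the region proposal method supplies candidate text regions and the attention loop refines them; here the loop simply wanders under random weights, which is enough to show the data flow.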

Citation (APA)

Huang, W., He, D., Yang, X., Zhou, Z., Kifer, D., & Giles, C. L. (2016). Detecting arbitrary oriented text in the wild with a visual attention model. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (pp. 551–555). Association for Computing Machinery, Inc. https://doi.org/10.1145/2964284.2967282
