Abstract
Text embedded in images provides important semantic information about a scene and its content. Detecting text in an unconstrained environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. We present a novel attention model for detecting arbitrarily oriented and curved scene text. Inspired by the attention mechanisms in the human visual system, our model utilizes a spatial glimpse network to process the attended area and deploys a recurrent neural network that aggregates information over time to determine the attention movement. Combining this with an off-the-shelf region proposal method, the model achieves state-of-the-art performance on the widely used ICDAR2013 dataset and on the MSRA-TD500 dataset, which contains arbitrarily oriented text.
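The glimpse-plus-recurrence loop the abstract describes can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the crop size, layer widths, tanh RNN cell, and the location head are all assumptions introduced here for clarity.

```python
import numpy as np

# Hypothetical sketch of the recurrent visual attention loop described in the
# abstract: crop a glimpse, encode it, update a recurrent state, and predict
# where to attend next. All sizes and the simple tanh cell are assumptions.

rng = np.random.default_rng(0)
GLIMPSE, HIDDEN = 8, 32

# Randomly initialized weights stand in for trained parameters.
W_enc = rng.standard_normal((GLIMPSE * GLIMPSE, HIDDEN)) * 0.1  # glimpse encoder
W_h = rng.standard_normal((HIDDEN, HIDDEN)) * 0.1               # recurrent weights
W_loc = rng.standard_normal((HIDDEN, 2)) * 0.1                  # next-location head

def take_glimpse(image, center, size=GLIMPSE):
    """Crop a size x size patch around `center` (row, col), clipped to the image."""
    r = int(np.clip(center[0], size // 2, image.shape[0] - size // 2))
    c = int(np.clip(center[1], size // 2, image.shape[1] - size // 2))
    return image[r - size // 2 : r + size // 2, c - size // 2 : c + size // 2]

def attention_step(image, loc, h):
    """One step: encode the attended patch, update the RNN state, emit a move."""
    g = take_glimpse(image, loc).reshape(-1)
    h = np.tanh(g @ W_enc + h @ W_h)   # aggregate information over time
    delta = h @ W_loc                  # offset for the next attention movement
    return delta, h

image = rng.random((64, 64))
loc, h = np.array([32.0, 32.0]), np.zeros(HIDDEN)
for _ in range(4):                     # a few attention movements
    delta, h = attention_step(image, loc, h)
    loc = loc + delta

print(h.shape, loc.shape)
```

In the paper, the predicted regions would then be combined with an off-the-shelf region proposal method; here the loop only illustrates how the glimpse encoding and recurrent state jointly drive the attention trajectory.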
Citation
Huang, W., He, D., Yang, X., Zhou, Z., Kifer, D., & Giles, C. L. (2016). Detecting arbitrary oriented text in the wild with a visual attention model. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (pp. 551–555). Association for Computing Machinery, Inc. https://doi.org/10.1145/2964284.2967282