Aggregating local context for accurate scene text detection

Abstract

Scene text reading continues to be of interest for many reasons, including applications for the visually impaired and automatic image indexing systems. Here we propose a novel end-to-end scene text detection algorithm. First, to identify text regions we design a novel Convolutional Neural Network (CNN) architecture that aggregates local surrounding information for cascaded, fast and accurate detection. The local information serves as context and provides rich cues to distinguish text from background noise. In addition, we design a novel grouping algorithm on top of the detected character graph, as well as a text line refinement step. Text line refinement consists of a text line extension module together with a text line filtering and regression module; jointly they produce accurate oriented text line bounding boxes. Experiments show that our method achieves state-of-the-art performance on several benchmark data sets: ICDAR 2003 (IC03), ICDAR 2013 (IC13) and Street View Text (SVT).
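To make the pipeline in the abstract more concrete, the sketch below illustrates (not the authors' implementation) the two post-CNN stages it mentions: linking detected characters into a graph, grouping them into text lines, and fitting an oriented box to each line by regressing a line through the character centers. The linking rule, distance and height-ratio thresholds, and the least-squares fit are illustrative assumptions, not the specific algorithm of the paper.

```python
# Minimal sketch of character grouping and oriented text-line fitting.
# Inputs are character bounding boxes assumed to come from a context-aware
# CNN detector; all thresholds here are hypothetical.
import numpy as np


def group_characters(boxes, dist_ratio=2.0, height_ratio=1.5):
    """Link characters whose centers are close (relative to their heights)
    and whose heights are similar; return connected components as groups.
    boxes: (N, 4) array-like of [x1, y1, x2, y2] character detections."""
    boxes = np.asarray(boxes, dtype=float)
    n = len(boxes)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    heights = boxes[:, 3] - boxes[:, 1]

    # Build an adjacency list from the pairwise linking rule (the "character graph").
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = np.linalg.norm(centers[i] - centers[j])
            h_min, h_max = sorted([heights[i], heights[j]])
            if dist < dist_ratio * h_max and h_max < height_ratio * h_min:
                adj[i].append(j)
                adj[j].append(i)

    # Connected components (depth-first search) = candidate text lines.
    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            comp.append(v)
            stack.extend(adj[v])
        groups.append(comp)
    return groups


def oriented_box(boxes, group):
    """Fit a straight line through the character centers of one group
    (least-squares regression) and return (angle in degrees, axis-aligned extent)."""
    sub = np.asarray(boxes, dtype=float)[group]
    cx = (sub[:, 0] + sub[:, 2]) / 2
    cy = (sub[:, 1] + sub[:, 3]) / 2
    if len(group) > 1 and np.ptp(cx) > 1e-6:
        slope, _ = np.polyfit(cx, cy, 1)          # orientation of the text line
        angle = float(np.degrees(np.arctan(slope)))
    else:
        angle = 0.0
    extent = [sub[:, 0].min(), sub[:, 1].min(), sub[:, 2].max(), sub[:, 3].max()]
    return angle, extent


if __name__ == "__main__":
    # Three characters on a slightly slanted line plus one isolated detection.
    chars = [[10, 10, 20, 30], [25, 12, 35, 32], [40, 14, 50, 34], [200, 200, 210, 220]]
    for g in group_characters(chars):
        print(g, oriented_box(chars, g))
```

In this toy example the three nearby characters form one group and yield a slightly tilted line, while the far-away detection stays isolated; the paper's filtering and extension modules would further prune and extend such candidate lines.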

Citation (APA)

He, D., Yang, X., Huang, W., Zhou, Z., Kifer, D., & Giles, C. L. (2017). Aggregating local context for accurate scene text detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10115 LNCS, pp. 280–296). Springer Verlag. https://doi.org/10.1007/978-3-319-54193-8_18
