A unified deep neural network for scene text detection

2Citations
Citations of this article
5Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Scene text detection is important and valuable for text recognition in natural scenes, but it is still a very challenging problem. In this paper, we propose a unified deep neural network for scene text detection, which is composed of a Fully Convolutional Network (FCN) for text saliency map generation and a Bounding box Regression Network (BRN) for text bounding boxes prediction. The FCN is trained with a hybrid loss function based on two types of pixel-wise ground truth masks while the unified neural network is fine-tuned with a multitask loss function. Additionally, the post-processing procedures including scoring the predicted bounding boxes by the saliency map and eliminating the redundant boxes via the Non-Maximum Suppression (NMS) method are applied to improve the final text detection results. It is demonstrated by the experimental results on ICDAR2013 benchmark that our proposed unified deep neural network can achieve good performance of text detection and process images at 5 fps, being faster than most of the existing text detection methods.

Cite

CITATION STYLE

APA

Li, Y., & Ma, J. (2017). A unified deep neural network for scene text detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10361 LNCS, pp. 101–112). Springer Verlag. https://doi.org/10.1007/978-3-319-63309-1_10

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free