Scene text spotting aims to simultaneously localize and recognize text instances, symbols, and logos in natural scene images. Scene text detection and recognition approaches have received immense attention in the computer vision research community. Partial occlusion and truncation artifacts caused by the cluttered backgrounds of scene images obstruct the perception of text instances, making the spotting process very complex. In this paper, we propose a light-weight scene text spotter that addresses the cluttered environment of scene images. It is an end-to-end trainable deep neural network that uses local part information, global structural features, and context cues of oriented region proposals to spot text instances. It localizes text in scene images with background clutter, where partially occluded text parts, truncation artifacts, and perspective distortions are present. We mitigate misclassification caused by inter-class interference by exploiting inter-class separability and intra-class compactness. We also incorporate multi-language character segmentation and word-level recognition in a light-weight recognition module. We evaluate the network on six publicly available benchmark datasets across different smart devices to demonstrate its efficacy.
CITATION STYLE
Bagi, R., Dutta, T., & Gupta, H. P. (2020). Cluttered TextSpotter: An End-to-End Trainable Light-Weight Scene Text Spotter for Cluttered Environment. IEEE Access, 8, 111433–111447. https://doi.org/10.1109/ACCESS.2020.3002808