A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos

Abstract

Scene text in video is often degraded by various blurs, such as those caused by camera or text motion, which makes reliably extracting it for content-based video applications more difficult. In this paper, we propose a novel fully convolutional deep neural network for deblurring and detecting text in video. To cope with blurred video text, we design an effective deblurring subnetwork composed of multi-level convolutional blocks with both cross-block (long) and within-block (short) skip connections for progressively learning residual deblurred image details, together with a spatial attention mechanism that focuses on blurred regions; the subnetwork generates a sharper image for the current frame by fusing multiple adjacent frames. To then localize text in the frames, we enhance the EAST text detection model with deformable convolution layers and deconvolution layers, which better capture the widely varied appearances of video text. Experiments on a public scene text video dataset demonstrate the state-of-the-art performance of the proposed video text deblurring and detection model.
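The deblurring subnetwork described above can be illustrated with a minimal PyTorch sketch. The module names, channel widths, number of blocks, and five-frame fusion window below are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Predicts a per-pixel weight map so blurred regions receive more focus."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=7, padding=3)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))  # (N, 1, H, W) gate in [0, 1]

class ResidualBlock(nn.Module):
    """Convolutional block with a within-block (short) skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # short skip

class DeblurSubnet(nn.Module):
    """Fuses adjacent frames and predicts residual sharp-image details."""
    def __init__(self, num_frames=5, channels=64, num_blocks=4):
        super().__init__()
        # Early fusion: stack the RGB frames along the channel axis.
        self.head = nn.Conv2d(3 * num_frames, channels, 3, padding=1)
        self.blocks = nn.ModuleList(
            [ResidualBlock(channels) for _ in range(num_blocks)])
        self.attn = SpatialAttention(channels)
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, frames):
        # frames: (N, num_frames, 3, H, W); the center frame is the target.
        n, t, c, h, w = frames.shape
        center = frames[:, t // 2]
        feat = self.head(frames.reshape(n, t * c, h, w))
        shallow = feat
        for block in self.blocks:
            feat = block(feat)
        feat = self.attn(feat + shallow)  # long (cross-block) skip
        return center + self.tail(feat)   # add learned residual details

For the detection stage, replacing a standard convolution in the EAST feature-merging branch with a deformable convolution could look roughly as follows; the offset-predicting layer and its placement are assumptions rather than the paper's exact design.

from torchvision.ops import DeformConv2d

class DeformConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # Two offsets (dx, dy) per kernel sampling position.
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        # The learned offsets let the sampling grid adapt to the varied
        # shapes and orientations of video text.
        return self.dconv(x, self.offset(x))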

Cite

APA

Wang, Y., Qian, Y., Shi, J., & Su, F. (2020). A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11962 LNCS, pp. 112–124). Springer. https://doi.org/10.1007/978-3-030-37734-2_10
