Construction of Scene Tibetan Dataset Based on GAN

Abstract

For research on Tibetan scene text detection and recognition, collecting and annotating natural scene data manually is time-consuming and laborious, so artificially synthesized data is of great significance for advancing related work. This paper studies replacing text in other languages in a scene with Tibetan while maintaining the style of the original text. We decompose the problem into three sub-networks: a text style transfer network, a background inpainting network, and a fusion network. First, the text style transfer network renders the target Tibetan text in the style of the original scene text to generate the foreground image. Then the background inpainting network erases the original text in the style image and fills the text region with information from its surroundings to generate the background image. Finally, the fusion network combines the generated foreground and background images to produce the target image. We experimented with English-to-Tibetan and English-to-English conversion to verify the generalization and robustness of the network. Experimental results show that the image quality metrics (SSIM, PSNR) on several datasets (SVT, ICDAR 2013) are improved to some extent.
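As a rough illustration of the three-stage pipeline described in the abstract, the sketch below composes three placeholder modules in the same order: a text style transfer network that produces the foreground, a background inpainting network that produces the background, and a fusion network that combines them. This is not the authors' code; the module bodies, the input resolution, and the channel-wise concatenation of the style image with a rendered Tibetan text image are assumptions made only to keep the example self-contained and runnable (PyTorch).

```python
# Minimal sketch of the three sub-network composition (illustrative only).
import torch
import torch.nn as nn


def conv_block(in_ch, hidden_ch):
    """Tiny convolutional stand-in for each sub-network's real backbone."""
    return nn.Sequential(
        nn.Conv2d(in_ch, hidden_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(hidden_ch, 3, kernel_size=3, padding=1),
    )


class SceneTextReplacer(nn.Module):
    """Compose text style transfer, background inpainting, and fusion."""

    def __init__(self):
        super().__init__()
        # Style image (3 ch) + rendered Tibetan text image (3 ch) -> foreground.
        self.style_transfer = conv_block(6, 32)
        # Style image (3 ch) -> background with the original text erased.
        self.background_inpaint = conv_block(3, 32)
        # Foreground (3 ch) + background (3 ch) -> final target image.
        self.fusion = conv_block(6, 32)

    def forward(self, style_img, target_text_img):
        foreground = self.style_transfer(torch.cat([style_img, target_text_img], dim=1))
        background = self.background_inpaint(style_img)
        target = self.fusion(torch.cat([foreground, background], dim=1))
        return target, foreground, background


# Usage with dummy tensors (one 64x256 RGB scene-text crop).
model = SceneTextReplacer()
style_img = torch.randn(1, 3, 64, 256)        # scene crop containing the original text
target_text_img = torch.randn(1, 3, 64, 256)  # plain rendering of the Tibetan text
target, fg, bg = model(style_img, target_text_img)
print(target.shape)  # torch.Size([1, 3, 64, 256])
```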

Citation (APA)

Zhang, G., Wang, W., Zhao, P., & Li, J. (2021). Construction of Scene Tibetan Dataset Based on GAN. In Journal of Physics: Conference Series (Vol. 1871). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1871/1/012130
