Query is GAN: Scene Retrieval with Attentional Text-To-Image Generative Adversarial Network

Abstract

Scene retrieval from input descriptions has become an increasingly important application as the number of videos on the Web continues to grow. However, it remains a challenging task because of the semantic gap between textual and visual features. In this paper, we address this problem by utilizing a text-to-image Generative Adversarial Network (GAN), which has become one of the most active research topics in recent years. A text-to-image GAN is a deep learning model that generates images from their corresponding descriptions. We propose a new retrieval framework, 'Query is GAN', based on the text-to-image GAN that drastically improves scene retrieval performance through simple procedures. Our novel idea is to use images generated by the text-to-image GAN as queries for the scene retrieval task. In addition, unlike many studies on text-to-image GANs that mainly focus on generating high-quality images, we reveal that the generated images have visual features suitable for use as queries even though they are not visually pleasing. We show the effectiveness of the proposed framework through experimental evaluations in which scene retrieval is performed on real video datasets.
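To make the "generated image as query" idea concrete, the following is a minimal sketch of such a pipeline, not the authors' implementation. It assumes a hypothetical wrapper `generate_image_from_text` around some pretrained text-to-image GAN (e.g., an AttnGAN-style model) that returns a PIL image, that candidate scene frames are plain image files on disk, and that a generic CNN backbone stands in for whatever visual features the retrieval system actually indexes.

```python
# Sketch: retrieve scene frames by comparing them with an image generated
# from the text query. Backbone choice and function names are assumptions.
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Generic visual feature extractor (ResNet-50 with the classifier removed).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the 2048-d pooled features
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def visual_feature(img: Image.Image) -> torch.Tensor:
    """Embed one image into the retrieval feature space (unit-normalized)."""
    x = preprocess(img).unsqueeze(0)
    return F.normalize(backbone(x), dim=-1).squeeze(0)

def retrieve(description: str, frame_paths: list[str],
             generate_image_from_text, k: int = 5):
    """Rank scene frames by similarity to the GAN-generated query image."""
    query_img = generate_image_from_text(description)  # hypothetical GAN wrapper
    q = visual_feature(query_img)
    scores = []
    for path in frame_paths:
        f = visual_feature(Image.open(path).convert("RGB"))
        scores.append((float(q @ f), path))             # cosine similarity
    return sorted(scores, reverse=True)[:k]
```

The key point the sketch illustrates is that, once the query text has been turned into an image, retrieval reduces to an image-to-image similarity search, so text and video features no longer need to be compared across modalities.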

Citation (APA)

Yanagi, R., Togo, R., Ogawa, T., & Haseyama, M. (2019). Query is GAN: Scene Retrieval with Attentional Text-To-Image Generative Adversarial Network. IEEE Access, 7, 153183–153193. https://doi.org/10.1109/ACCESS.2019.2947409
