Conditional GAN with discriminative filter generation for text-to-video synthesis

Citations: 63
Readers (Mendeley): 73

Abstract

Developing conditional generative models for text-to-video synthesis is an extremely challenging yet important topic of research in machine learning. In this work, we address this problem by introducing Text-Filter conditioning Generative Adversarial Network (TFGAN), a conditional GAN model with a novel multi-scale text-conditioning scheme that improves text-video associations. By combining the proposed conditioning scheme with a deep GAN architecture, TFGAN generates high-quality videos from text on challenging real-world video datasets. In addition, we construct a synthetic dataset of text-conditioned moving shapes to systematically evaluate our conditioning scheme. Extensive experiments demonstrate that TFGAN significantly outperforms existing approaches, and can also generate videos of novel categories not seen during training.

Cite

APA

Balaji, Y., Min, M. R., Bai, B., Chellappa, R., & Graf, H. P. (2019). Conditional GAN with discriminative filter generation for text-to-video synthesis. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 1995–2001). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/276
