Despite recent advances, text-to-image generation on complex datasets such as MSCOCO, where each image contains varied objects, remains a challenging task. In this paper, we propose a method named visual-memory Creative Adversarial Network (vmCAN) to generate images conditioned on their corresponding narrative sentences. vmCAN leverages an external visual knowledge memory in both multi-modal fusion and image synthesis. By conditioning synthesis on both the internal textual description and externally triggered "visual proposals", our method improves the inception score of the baseline method by 17.6% on the challenging COCO dataset.
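The abstract describes retrieving "visual proposals" from an external visual memory and fusing them with the textual description to condition image synthesis. The sketch below illustrates one plausible reading of that pipeline, not the paper's actual implementation: it assumes the memory stores pooled image feature vectors, retrieves the top-k entries by cosine similarity to a text embedding, and mean-pools them before concatenation. The function names (`retrieve_proposals`, `fuse`) and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical external visual knowledge memory: N stored visual
# feature vectors (e.g., pooled CNN features of training images).
memory = rng.normal(size=(1000, 128))
memory /= np.linalg.norm(memory, axis=1, keepdims=True)

def retrieve_proposals(text_emb, memory, k=5):
    """Return the k memory entries most similar to the text embedding."""
    q = text_emb / np.linalg.norm(text_emb)
    scores = memory @ q                    # cosine similarity to each entry
    topk = np.argsort(-scores)[:k]
    return memory[topk]                    # (k, 128) "visual proposals"

def fuse(text_emb, proposals):
    """Toy multi-modal fusion: mean-pool proposals, concat with text."""
    return np.concatenate([text_emb, proposals.mean(axis=0)])

text_emb = rng.normal(size=128)            # stand-in sentence embedding
cond = fuse(text_emb, retrieve_proposals(text_emb, memory))
print(cond.shape)                          # conditioning vector for the generator
```

In a full system the resulting conditioning vector would be fed to a GAN generator alongside a noise vector; here it simply demonstrates the retrieval-then-fusion idea.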
Citation
Zhang, S., Dong, H., Hu, W., Guo, Y., Wu, C., Xie, D., & Wu, F. (2018). Text-to-image synthesis via visual-memory creative adversarial network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11166 LNCS, pp. 417–427). Springer Verlag. https://doi.org/10.1007/978-3-030-00764-5_38