Despite recent advances, text-to-image generation on complex datasets such as MSCOCO, where each image contains varied objects, remains a challenging task. In this paper, we propose a method named visual-memory Creative Adversarial Network (vmCAN) to generate images conditioned on their corresponding narrative sentences. vmCAN leverages an external visual knowledge memory in both multi-modal fusion and image synthesis. By conditioning synthesis on both the internal textual description and externally triggered "visual proposals", our method improves the inception score of the baseline method by 17.6% on the challenging COCO dataset.
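The abstract describes retrieving "visual proposals" from an external visual memory and fusing them with the textual description to condition image synthesis. The sketch below illustrates one plausible reading of that pipeline, not the paper's actual implementation: it assumes the memory stores pooled image feature vectors, retrieves the top-k entries by cosine similarity to a text embedding, and mean-pools them before concatenation. The function names (`retrieve_proposals`, `fuse`) and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical external visual knowledge memory: N stored visual
# feature vectors (e.g., pooled CNN features of training images).
memory = rng.normal(size=(1000, 128))
memory /= np.linalg.norm(memory, axis=1, keepdims=True)

def retrieve_proposals(text_emb, memory, k=5):
    """Return the k memory entries most similar to the text embedding."""
    q = text_emb / np.linalg.norm(text_emb)
    scores = memory @ q                    # cosine similarity to each entry
    topk = np.argsort(-scores)[:k]
    return memory[topk]                    # (k, 128) "visual proposals"

def fuse(text_emb, proposals):
    """Toy multi-modal fusion: mean-pool proposals, concat with text."""
    return np.concatenate([text_emb, proposals.mean(axis=0)])

text_emb = rng.normal(size=128)            # stand-in sentence embedding
cond = fuse(text_emb, retrieve_proposals(text_emb, memory))
print(cond.shape)                          # conditioning vector for the generator
```

In a full system the resulting conditioning vector would be fed to a GAN generator alongside a noise vector; here it simply demonstrates the retrieval-then-fusion idea.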
Citation
Zhang, S., Dong, H., Hu, W., Guo, Y., Wu, C., Xie, D., & Wu, F. (2018). Text-to-image synthesis via visual-memory creative adversarial network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11166 LNCS, pp. 417–427). Springer Verlag. https://doi.org/10.1007/978-3-030-00764-5_38