Text-to-Image Generation Using Deep Learning †

14 citations · 58 Mendeley readers

Abstract

Text-to-image generation is the task of producing images that correspond to given textual descriptions. It has a significant influence on many research areas and a diverse set of applications (e.g., photo searching, photo editing, art generation, computer-aided design, image reconstruction, captioning, and portrait drawing). The most challenging aspect is to consistently produce realistic images that satisfy the given conditions; existing text-to-image algorithms often create images that do not properly match the text. We address this issue in our study by building a deep learning-based architecture for semantically consistent image generation: the recurrent convolutional generative adversarial network (RC-GAN). RC-GAN bridges advances in text and image modelling, translating visual concepts from words to pixels. The proposed model was trained on the Oxford-102 Flowers dataset, and its performance was evaluated using the Inception Score and the peak signal-to-noise ratio (PSNR). The experimental results demonstrate that our model generates more realistic images of flowers from given captions, achieving an Inception Score of 4.15 and a PSNR of 30.12 dB. In future work, we aim to train the proposed model on additional datasets.
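The PSNR figure reported above is a standard pixel-level fidelity metric: it compares a generated image against a reference via the mean squared error, on a logarithmic (dB) scale. As a minimal sketch of how such a value is computed (the function name and toy arrays here are illustrative, not taken from the paper):

```python
import numpy as np

def psnr(reference, generated, max_val=255.0):
    """Peak signal-to-noise ratio (dB): 10 * log10(MAX^2 / MSE)."""
    diff = reference.astype(np.float64) - generated.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: two 8x8 "images" differing by a constant offset of 10,
# so MSE = 100 and PSNR = 10 * log10(255^2 / 100) ≈ 28.13 dB.
a = np.zeros((8, 8), dtype=np.uint8)
b = np.full((8, 8), 10, dtype=np.uint8)
print(round(psnr(a, b), 2))  # 28.13
```

Higher values indicate closer pixel-level agreement; the 30.12 dB reported in the abstract would be averaged over generated/reference image pairs.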

Citation (APA)

Ramzan, S., Iqbal, M. M., & Kalsum, T. (2022). Text-to-Image Generation Using Deep Learning †. Engineering Proceedings, 20(1). https://doi.org/10.3390/engproc2022020016
