Language-agnostic Semantic Consistent Text-to-Image Generation


Abstract

Recent GAN-based text-to-image generation models have advanced to the point where they can generate photo-realistic images that semantically match their descriptions. However, multilingual text-to-image generation has received little research attention. Constructing a multilingual text-to-image generation model raises two problems: 1) the language imbalance in text-image paired datasets, and 2) texts that have the same meaning but are expressed in different languages can yield semantically inconsistent images. To this end, we propose a Language-agnostic Semantic Consistent Generative Adversarial Network (LaSC-GAN) for text-to-image generation, which generates semantically consistent images via a language-agnostic text encoder and a Siamese mechanism. Experiments on text-image datasets for relatively low-resource languages show that the model achieves generation quality comparable to that of images generated from high-resource-language text, and that it generates semantically consistent images for texts with the same meaning even across different languages.
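The abstract does not detail how the Siamese mechanism enforces cross-lingual consistency. A minimal sketch of one plausible formulation, in which parallel captions in two languages are embedded by a shared (language-agnostic) encoder and penalized for embedding dissimilarity, might look like the following. All names and the exact loss form here are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def siamese_consistency_loss(emb_lang_a, emb_lang_b):
    # Hypothetical Siamese consistency loss: encourage embeddings of
    # parallel captions (same meaning, different languages) to align.
    # The loss is 0 when the two embeddings point in the same direction.
    return 1.0 - cosine_similarity(emb_lang_a, emb_lang_b)

# Toy example with stand-in "encoder outputs" for an English and a
# low-resource-language caption describing the same image. A real
# language-agnostic text encoder would produce these embeddings;
# here they are random vectors made deliberately close to each other.
rng = np.random.default_rng(0)
emb_en = rng.normal(size=128)
emb_lr = emb_en + 0.1 * rng.normal(size=128)  # nearly parallel embedding

loss = siamese_consistency_loss(emb_en, emb_lr)
print(f"consistency loss: {loss:.4f}")  # small, since the embeddings are close
```

Minimizing such a loss alongside the usual GAN objectives would push the generator's conditioning signal to be the same regardless of the caption's language, which is one way the described semantic consistency could be achieved.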

Citation (APA)

Jung, S. J., Choi, W. S., Choi, S., & Zhang, B. T. (2022). Language-agnostic Semantic Consistent Text-to-Image Generation. In MML 2022 - 1st Workshop on Multilingual Multimodal Learning, Proceedings of the Workshop (pp. 1–5). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.mml-1.1
