StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation


Abstract

Generating images that fit a given text description using machine learning has improved greatly with the release of technologies such as the CLIP image-text encoder; however, current methods lack artistic control over the style of the generated image. We present an approach for generating styled drawings for a given text description, where a user specifies a desired drawing style with a sample image. Inspired by the theory in art that style and content are generally inseparable during the creative process, we propose a coupled approach, StyleCLIPDraw, in which the drawing is generated by optimizing for style and content simultaneously throughout the process, rather than applying style transfer after the content is created. Based on human evaluation, the styles of images generated by StyleCLIPDraw are strongly preferred to those of the sequential approach. Although content quality degrades for certain styles, StyleCLIPDraw is far preferred overall when both content and style are considered, indicating both the importance of the style, look, and feel of machine-generated images to people and that style is coupled into the drawing process itself. Our code, a demonstration, and style evaluation data are publicly available.
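The difference between coupled and sequential generation can be illustrated with a toy sketch. This is not the authors' code: the quadratic losses below are hypothetical stand-ins for CLIP-based content and style losses, and the "drawing" is just a 2-D parameter vector. The point is that optimizing a joint objective at every step balances both goals, whereas optimizing content first and style second lets the style phase overwrite the content.

```python
# Toy sketch (not the authors' implementation): coupled vs. sequential
# optimization. Squared-distance losses stand in for CLIP content/style
# losses; the optimized vector x stands in for drawing parameters.

def grad_step(x, target, lr):
    # One gradient-descent step on the loss ||x - target||^2.
    return [xi - lr * 2.0 * (xi - ti) for xi, ti in zip(x, target)]

def coupled(x, content_t, style_t, steps=200, lr=0.05, w=0.5):
    # StyleCLIPDraw-style coupling: descend a weighted sum of both
    # losses at every step, so neither objective is ever abandoned.
    for _ in range(steps):
        gx = [2.0 * ((1 - w) * (xi - ci) + w * (xi - si))
              for xi, ci, si in zip(x, content_t, style_t)]
        x = [xi - lr * gi for xi, gi in zip(x, gx)]
    return x

def sequential(x, content_t, style_t, steps=200, lr=0.05):
    # Sequential baseline: fit content fully, then apply style;
    # the style phase pulls the result away from the content optimum.
    for _ in range(steps // 2):
        x = grad_step(x, content_t, lr)
    for _ in range(steps // 2):
        x = grad_step(x, style_t, lr)
    return x

content_t, style_t = [1.0, 0.0], [0.0, 1.0]
xc = coupled([0.0, 0.0], content_t, style_t)    # settles between both targets
xs = sequential([0.0, 0.0], content_t, style_t)  # ends at the style target
```

With conflicting targets, the coupled run converges to a compromise between content and style, while the sequential run ends essentially at the style target, having discarded the content optimum it reached in its first phase.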

Citation (APA)

Schaldenbrand, P., Liu, Z., & Oh, J. (2022). StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation. In IJCAI International Joint Conference on Artificial Intelligence (pp. 4966–4972). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2022/688
