We propose a novel unsupervised image captioning method. Image captioning spans two fields of deep learning: natural language processing and computer vision. The excessive pursuit of evaluation metrics makes the captions generated by existing models monotonous in style, failing to meet people's demand for vivid, stylized image captions. We therefore propose an image captioning model that combines text style transfer with image emotion recognition, enabling the model to better understand images and generate controllable stylized captions. The image emotion recognition module automatically judges the emotion conveyed by an image, improving the model's understanding of the image content, while the text style transfer method controls the description, producing captions that meet people's expectations. To our knowledge, this is the first work to combine image emotion recognition with text style control.
Tian, J., Yang, Z., & Shi, S. (2022). Unsupervised Style Control for Image Captioning. In Communications in Computer and Information Science (Vol. 1628 CCIS, pp. 413–424). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-5194-7_31