Emotion and empathy are examples of human qualities lacking in many human-machine interactions. The goal of our work is to generate engaging dialogue grounded in a user-shared image with increased emotion and empathy while minimizing socially inappropriate or offensive outputs. We release the Neural Image Commenting with Empathy (NICE) dataset, consisting of almost two million images and the corresponding human-generated comments, a set of human annotations, and baseline performance on a range of models. Rather than relying only on manually labeled emotions, we also use automatically generated linguistic representations as a source of weakly supervised labels. Based on these annotations, we define two different tasks for the NICE dataset. We then provide a novel pre-training model, Modeling Affect Generation for Image Comments (MAGIC), which aims to generate comments for images conditioned on linguistic representations that capture style and affect, and to help generate more empathetic, emotional, engaging, and socially appropriate comments. Using this model, we achieve state-of-the-art performance on one of our NICE tasks. The experiments show that the approach can generate more human-like and engaging image comments.
Chen, K., Huang, Q., McDuff, D., Gao, X., Palangi, H., Wang, J., … Gao, J. (2021). NICE: Neural Image Commenting with Empathy. In Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (pp. 4456–4472). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.380