Discovering Connotations as Labels for Weakly Supervised Image-Sentence Data

Aditya Mogadala; Bhargav Kanuparthi; Achim Rettinger; York Sure-Vetter

Conference Proceedings (Open Access)

The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018 (2018) 379-386

DOI: 10.1145/3184558.3186352

Citations: 1

Readers: 8

Abstract

The growth of multimodal content on the web and social media has generated abundant weakly aligned image-sentence pairs. However, these pairs are hard to interpret directly because of their intrinsic intension. In this paper, we aim to annotate such image-sentence pairs with connotations as labels to capture the intrinsic intension. We achieve this with a connotation multimodal embedding model (CMEM) using a novel loss function. Its unique characteristics over previous models include: (i) exploitation of multimodal data as opposed to only visual information, (ii) robustness to outlier labels in a multi-label scenario, and (iii) effectiveness on large-scale weakly supervised data. Through extensive quantitative evaluation, we demonstrate the effectiveness of CMEM for detecting multiple labels compared with other state-of-the-art approaches. We also show that, in addition to annotating image-sentence pairs with connotation labels, a byproduct of our model inherently supports cross-modal retrieval, i.e., retrieving sentences for an image query.

Citation (APA)

Mogadala, A., Kanuparthi, B., Rettinger, A., & Sure-Vetter, Y. (2018). Discovering Connotations as Labels for Weakly Supervised Image-Sentence Data. In The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018 (pp. 379–386). Association for Computing Machinery, Inc. https://doi.org/10.1145/3184558.3186352
