Using object detection, NLP, and knowledge bases to understand the message of images

Lydia Weiland; Ioana Hulpus; Simone Paolo Ponzetto; Laura Dietz

Conference Proceedings

Using object detection, NLP, and knowledge bases to understand the message of images

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10133 LNCS 405-418

DOI: 10.1007/978-3-319-51814-5_34

3Citations

8Readers

Get full text

Abstract

With the increasing amount of multimodal content from social media posts and news articles, there has been an intensified effort towards conceptual labeling and multimodal (topic) modeling of images and of their affiliated texts. Nonetheless, the problem of identifying and automatically naming the core abstract message (gist) behind images has received less attention. This problem is especially relevant for the semantic indexing and subsequent retrieval of images. In this paper, we propose a solution that makes use of external knowledge bases such as Wikipedia and DBpedia. Its aim is to leverage complex semantic associations between the image objects and the textual caption in order to uncover the intended gist. The results of our evaluation prove the ability of our proposed approach to detect gist with a best MAP score of 0.74 when assessed against human annotations. Furthermore, an automatic image tagging and caption generation API is compared to manually set image and caption signals. We show and discuss the difficulty to find the correct gist especially for abstract, non-depictable gists as well as the impact of different types of signals on gist detection quality.

Cite

CITATION STYLE

APA

Weiland, L., Hulpus, I., Ponzetto, S. P., & Dietz, L. (2017). Using object detection, NLP, and knowledge bases to understand the message of images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10133 LNCS, pp. 405–418). Springer Verlag. https://doi.org/10.1007/978-3-319-51814-5_34

Using object detection, NLP, and knowledge bases to understand the message of images

Abstract

Cite

Register to see more suggestions