Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval

286 citations · 286 Mendeley readers
Abstract

Semantically complex queries that include attributes of objects and relations between objects still pose a major challenge to image retrieval systems. Recent work in computer vision has shown that a graph-based semantic representation called a scene graph is an effective representation for very detailed image descriptions and for complex queries for retrieval. In this paper, we show that scene graphs can be effectively created automatically from a natural language scene description. We present a rule-based and a classifier-based scene graph parser whose output can be used for image retrieval. We show that including relations and attributes in the query graph outperforms a model that only considers objects, and that using the output of our parsers is almost as effective as using human-constructed scene graphs (Recall@10 of 27.1% vs. 33.4%). Additionally, we demonstrate the general usefulness of parsing to scene graphs by showing that the output can also be used to generate 3D scenes.
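To make the representation concrete, the sketch below models a scene graph in the form the abstract describes: a set of objects, attributes attached to objects, and binary relations between objects. This is a minimal illustration, not the paper's implementation; the example sentence, class, and field names are all assumptions chosen for readability.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Minimal scene graph: objects, per-object attributes, binary relations."""
    objects: set = field(default_factory=set)      # object names
    attributes: set = field(default_factory=set)   # (object, attribute) pairs
    relations: set = field(default_factory=set)    # (subject, predicate, object) triples


# Hypothetical graph for the description "a young woman rides a brown horse"
graph = SceneGraph()
graph.objects |= {"woman", "horse"}
graph.attributes |= {("woman", "young"), ("horse", "brown")}
graph.relations.add(("woman", "ride", "horse"))

# A query graph for retrieval has the same structure; per the abstract,
# matching on relations and attributes, not just objects, improves Recall@10.
print(graph)
```

A query such as "brown horse" would map to a small graph of the same shape, which can then be matched against the graphs of candidate images.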

Cite

APA

Schuster, S., Krishna, R., Chang, A., Fei-Fei, L., & Manning, C. D. (2015). Generating semantically precise scene graphs from textual descriptions for improved image retrieval. In Proceedings of the Fourth Workshop on Vision and Language (VL 2015) (pp. 70–80). Association for Computational Linguistics. https://doi.org/10.18653/v1/w15-2812
