Scene Graph Generation Using Depth, Spatial, and Visual Cues in 2D Images

Aiswarya S. Kumar; Jyothisha J. Nair

Journal Article, Open Access

IEEE Access (2022) 10 1968-1978

DOI: 10.1109/ACCESS.2021.3139000

Abstract

Understanding an image or scene requires identifying the objects in it, the relationships between them, and the attributes that describe their properties. A scene graph is a high-level representation that captures all of these in a structured form. Scene graph generation faces several challenges, including the semantics of the relationships considered and the availability of a well-balanced dataset with sufficient training examples. We mitigate these problems by extracting two subsets, VG-R10 and VG-A16, from the popular Visual Genome dataset. We also propose a framework (S2G) that generates scene graphs directly from images using depth and spatial information of object pairs. Evaluation of the scene graph generation model shows that the proposed framework achieves better results on our data than state-of-the-art methods.
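
To make the notion of a scene graph concrete, the following is a minimal sketch of one possible in-memory representation: objects as nodes, attributes attached to nodes, and relationships as directed, labeled edges. The class name `SceneGraph`, its methods, and the example labels are illustrative assumptions and are not taken from the paper or from the S2G framework.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical container: objects are nodes, relationships are directed edges."""

    objects: dict[int, str] = field(default_factory=dict)           # node id -> object label
    attributes: dict[int, list[str]] = field(default_factory=dict)  # node id -> attribute labels
    relationships: list[tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, label, attributes=()):
        """Add an object node and return its id."""
        node_id = len(self.objects)
        self.objects[node_id] = label
        self.attributes[node_id] = list(attributes)
        return node_id

    def relate(self, subject_id, predicate, object_id):
        """Add a directed edge (subject, predicate, object) between two nodes."""
        self.relationships.append((subject_id, predicate, object_id))

    def triples(self):
        """Return relationships as readable (subject, predicate, object) label triples."""
        return [(self.objects[s], p, self.objects[o]) for s, p, o in self.relationships]


# Example labels are made up for illustration only.
graph = SceneGraph()
man = graph.add_object("man", ["tall"])
horse = graph.add_object("horse", ["brown"])
graph.relate(man, "riding", horse)
print(graph.triples())
```

Storing relationships as id triples rather than label triples keeps two objects with the same label (for example, two horses) distinct, which matters once an image contains repeated object classes.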

Cite

Kumar, A. S., & Nair, J. J. (2022). Scene Graph Generation Using Depth, Spatial, and Visual Cues in 2D Images. IEEE Access, 10, 1968–1978. https://doi.org/10.1109/ACCESS.2021.3139000
