Exploiting Attention for Visual Relationship Detection

Abstract

Visual relationship detection aims to predict the categories of predicates and object pairs, and to localize the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework for the visual relationship detection task. First, we design a spatial attention model that specializes predicate features. Compared to a standard ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, to capture the relative spatial configuration of object pairs, we propose mapping simple geometric representations to a high-dimensional space, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN that treats subject, predicate, and object as a time sequence. We evaluate our method on three tasks; the experiments demonstrate that it achieves results competitive with state-of-the-art methods.
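
The abstract gives no implementation details, but the second and third components are concrete enough to sketch. Below is a minimal, hypothetical PyTorch sketch: `GeometricEncoder` and `TripletBiRNN` are illustrative names, and the feature dimensions, the choice of a GRU cell, and the 70-way predicate classifier (the VRD predicate count) are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class GeometricEncoder(nn.Module):
    """Hypothetical sketch of the second idea: map a simple, low-dimensional
    relative-geometry vector (e.g. box offsets and scale ratios) to a
    high-dimensional feature with a small MLP. Dimensions are assumptions."""

    def __init__(self, in_dim: int = 8, out_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, geom: torch.Tensor) -> torch.Tensor:
        # geom: (batch, in_dim) relative spatial configuration of a box pair
        return self.mlp(geom)


class TripletBiRNN(nn.Module):
    """Hypothetical sketch of the third idea: treat subject, predicate, and
    object features as a time sequence of length 3 and embed them with a
    bi-directional RNN (a GRU is assumed here)."""

    def __init__(self, feat_dim: int = 512, hidden_dim: int = 256,
                 num_predicates: int = 70):
        super().__init__()
        self.birnn = nn.GRU(feat_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_predicates)

    def forward(self, subj: torch.Tensor, pred: torch.Tensor,
                obj: torch.Tensor) -> torch.Tensor:
        # Stack the three region features into a (batch, 3, feat_dim)
        # sequence, ordered subject -> predicate -> object.
        seq = torch.stack([subj, pred, obj], dim=1)
        out, _ = self.birnn(seq)          # (batch, 3, 2 * hidden_dim)
        # Read the predicate logits off the middle (predicate) time step.
        return self.classifier(out[:, 1, :])


if __name__ == "__main__":
    batch = 4
    model = TripletBiRNN()
    subj, pred, obj = (torch.randn(batch, 512) for _ in range(3))
    print(model(subj, pred, obj).shape)   # torch.Size([4, 70])
```

The first component, the spatial attention module that replaces plain ROI pooling on the predicate branch, depends on detector internals the abstract does not describe, so it is omitted from this sketch.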

Cite

APA

Hu, T., Liao, W., Yang, M. Y., & Rosenhahn, B. (2019). Exploiting Attention for Visual Relationship Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11824 LNCS, pp. 331–344). Springer. https://doi.org/10.1007/978-3-030-33676-9_23
