Scenario referring expression comprehension via attributes of vision and language

Abstract

Referring Expression Comprehension (REC) is the task of locating particular objects within an image from natural language expressions. Previous studies on this task have assumed that the language expression and the image are in one-to-one correspondence: the target region that the expression refers to must exist in the current image, so the region with the highest score is located whether or not it actually matches. In practical applications, however, REC is required to locate the referred target region within a sequence of matched, semi-matched and mismatched scene images. This paper refers to this 3D version of the challenge as Scenario Referring Expression Comprehension (SREC). To accomplish this task, we built a test set by enhancing an existing real-scenario dataset, constructed for the first time a Dual Attributes Recursive Retrieve Reasoning Model (DA3R) that uses the attributes of both images and expressions, and finally verified the feasibility of the method on the test set by assessing it with three different types of enhanced expressions.
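The difference between the two problem settings can be illustrated with a minimal sketch. This is not the DA3R model, whose details the abstract does not give. It only contrasts "always return the highest-scoring region" with "search a sequence of images and allow the answer that no region matches". The function names, the score lists and the threshold are all hypothetical.

```python
from typing import List, Optional, Tuple


def classic_rec(region_scores: List[float]) -> int:
    """Classic REC assumption: the target is in this image, so the
    highest-scoring candidate region is returned unconditionally."""
    return max(range(len(region_scores)), key=region_scores.__getitem__)


def scenario_rec(
    sequence_scores: List[List[float]], threshold: float
) -> Optional[Tuple[int, int]]:
    """Scenario-style setting (illustrative assumption, not DA3R):
    scan every image in the sequence and return (image_index, region_index)
    of the best region, or None if no score reaches the threshold."""
    best: Optional[Tuple[int, int]] = None
    best_score = threshold
    for image_index, scores in enumerate(sequence_scores):
        for region_index, score in enumerate(scores):
            # Accept a region only if it clears the threshold and beats
            # the best accepted region found so far.
            if score >= threshold and (best is None or score > best_score):
                best = (image_index, region_index)
                best_score = score
    return best


if __name__ == "__main__":
    # Made-up matching scores: three images with 2, 2 and 1 candidate regions.
    sequence = [[0.1, 0.3], [0.2, 0.9], [0.4]]
    print(classic_rec(sequence[0]))          # forced pick inside image 0
    print(scenario_rec(sequence, 0.5))       # best region across the sequence
    print(scenario_rec([[0.1, 0.3]], 0.5))   # mismatched image: no region
```

In the first call the classic rule returns a region even though every score in that image is low, which is the failure mode the abstract describes for mismatched images.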

Cite

Wei, S., Wang, J., Sun, Y., Jin, G., Liang, J., & Liu, K. (2019). Scenario referring expression comprehension via attributes of vision and language. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11859 LNCS, pp. 430–441). Springer. https://doi.org/10.1007/978-3-030-31726-3_37
