“Show me the cup”: Reference with continuous representations

Abstract

One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects represented by natural images, points to the intended object if the expression has a unique referent, or indicates a failure, if it does not. The model, directly trained on reference acts, is competitive with a pipeline manually engineered to perform the same task, both when referents are purely visual, and when they are characterized by a combination of visual and linguistic properties.

Cite

Baroni, M., Boleda, G., & Padó, S. (2018). “Show me the cup”: Reference with continuous representations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10761 LNCS, pp. 209–224). Springer Verlag. https://doi.org/10.1007/978-3-319-77113-7_17
