Real-Time Understanding of Complex Discriminative Scene Descriptions

Abstract

Real-world scenes typically have complex structure, and utterances about them consequently do as well. We devise and evaluate a model that processes descriptions of complex configurations of geometric shapes and can identify the described scenes among a set of candidates, including similar distractors. The model works with raw images of scenes, and by design can work word-by-word incrementally. Hence, it can be used in highly responsive interactive and situated settings. Using a corpus of descriptions from game-play between human subjects (who found this to be a challenging task), we show that reconstruction of description structure in our system contributes to task success and supports the performance of the word-based model of grounded semantics that we use.

Cite

Manuvinakurike, R., Kennington, C., DeVault, D., & Schlangen, D. (2016). Real-Time Understanding of Complex Discriminative Scene Descriptions. In SIGDIAL 2016 - 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (pp. 232–241). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-3630
