Real-Time Understanding of Complex Discriminative Scene Descriptions

Ramesh Manuvinakurike; Casey Kennington; David DeVault; David Schlangen

Conference Proceedings (Open Access)

SIGDIAL 2016 - 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (2016) 232-241

DOI: 10.18653/v1/w16-3630

Abstract

Real-world scenes typically have complex structure, and utterances about them consequently do as well. We devise and evaluate a model that processes descriptions of complex configurations of geometric shapes and can identify the described scenes among a set of candidates, including similar distractors. The model works with raw images of scenes, and by design can work word-by-word incrementally. Hence, it can be used in highly responsive interactive and situated settings. Using a corpus of descriptions from game-play between human subjects (who found this to be a challenging task), we show that reconstruction of description structure in our system contributes to task success and supports the performance of the word-based model of grounded semantics that we use.

Citation (APA)

Manuvinakurike, R., Kennington, C., DeVault, D., & Schlangen, D. (2016). Real-Time Understanding of Complex Discriminative Scene Descriptions. In SIGDIAL 2016 - 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (pp. 232–241). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-3630
