Situated and Interactive Multimodal Conversations

Abstract

Next generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision, memories of previous interactions, and the user’s utterances), and perform multimodal actions (e.g., displaying a route while generating the system’s utterance). We introduce Situated Interactive MultiModal Conversations (SIMMC) as a new direction aimed at training agents that take multimodal actions grounded in a co-evolving multimodal input context in addition to the dialog history. We provide two SIMMC datasets totalling ∼13K human-human dialogs (∼169K utterances) collected using a multimodal Wizard-of-Oz (WoZ) setup, on two shopping domains: (a) furniture – grounded in a shared virtual environment; and (b) fashion – grounded in an evolving set of images. Datasets include multimodal context of the items appearing in each scene, and contextual NLU, NLG and coreference annotations using a novel and unified framework of SIMMC conversational acts for both user and assistant utterances. Finally, we present several tasks within SIMMC as objective evaluation protocols, such as structural API prediction, response generation, and dialog state tracking. We benchmark a collection of existing models on these SIMMC tasks as strong baselines, and demonstrate rich multimodal conversational interactions. Our data, annotations, and models are publicly available.
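To make the annotation scheme concrete, the sketch below shows roughly what a single annotated SIMMC turn could look like: a user utterance mapped to a SIMMC conversational act with slots and object references (the dialog state tracking and coreference targets), an assistant utterance paired with a structural API call (the API prediction target), and the multimodal context listing the items in the current scene. This is a minimal illustration only; the field names and act labels below are hypothetical and do not mirror the exact schema of the released JSON files.

```python
# Hypothetical sketch of one annotated SIMMC turn.
# Field names and act labels are illustrative, not the released schema.
example_turn = {
    "turn_idx": 3,
    "user": {
        "utterance": "Do you have that couch in a darker color?",
        "belief_state": {                 # dialog state tracking target
            "act": "REQUEST:GET",         # SIMMC conversational act (user side)
            "slots": {"type": "couch", "color": "dark"},
            "object_references": [2],     # coreference into the scene
        },
    },
    "assistant": {
        "utterance": "Yes, here is the same couch in charcoal grey.",  # NLG target
        "api_call": {                     # structural API prediction target
            "act": "INFORM:GET",
            "arguments": {"item_id": 17, "color": "charcoal grey"},
        },
    },
    "multimodal_context": [               # items appearing in the current scene
        {"object_id": 2, "type": "couch", "color": "beige"},
    ],
}

if __name__ == "__main__":
    # A model for the API prediction task would read the dialog history plus
    # multimodal_context and output the assistant's api_call structure.
    print(example_turn["assistant"]["api_call"])
```

In this reading, the benchmark tasks line up with the fields above: dialog state tracking predicts the user belief state, structural API prediction predicts the assistant API call, and response generation produces the assistant utterance, all grounded in the multimodal context.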

Citation (APA)

Moon, S., Kottur, S., Crook, P. A., De, A., Poddar, S., Levin, T., … Geramifard, A. (2020). Situated and Interactive Multimodal Conversations. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 1103–1121). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.96
