Demonstrating EMMA: Embodied MultiModal Agent for Language-guided Action Execution in 3D Simulated Environments

Abstract

We demonstrate EMMA, an embodied multimodal agent developed for the Alexa Prize SimBot Challenge. The agent acts within a 3D simulated environment to complete household tasks. EMMA is a unified multimodal generative model for solving embodied tasks. In contrast to previous work, our approach treats multiple multimodal tasks as a single multimodal conditional text generation problem. Furthermore, we showcase that a single generative agent can solve tasks with visual inputs of varying length, such as answering questions about static images or executing actions given a sequence of previous frames and dialogue utterances. The demo system allows users to interact conversationally with EMMA in embodied dialogues in different 3D environments from the TEACh dataset.
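The sketch below illustrates the framing the abstract describes: visual frames and dialogue utterances are serialized into one conditional text-generation input, so tasks with visual inputs of varying length (a single image for question answering, or a sequence of frames for action execution) share the same input/output format. This is not the authors' implementation; every name here (MultimodalExample, build_prompt, the <frame_i> and <task=...> placeholder tokens) is hypothetical and only stands in for the general idea.

    # Minimal sketch (not the EMMA implementation) of casting several
    # multimodal tasks as one conditional text-generation problem.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class MultimodalExample:
        frames: List[str]        # identifiers of visual frames (variable length)
        dialogue: List[str]      # previous dialogue utterances
        target_text: str         # text the model should generate (answer or action)

    def build_prompt(example: MultimodalExample, task: str) -> str:
        """Serialize frames + dialogue into a single generation prompt.

        A VQA example conditions on one frame; an action-execution example
        conditions on a sequence of frames, but both map to the same
        (prompt -> target text) format.
        """
        frame_tokens = " ".join(f"<frame_{i}>" for i, _ in enumerate(example.frames))
        dialogue_text = " ".join(example.dialogue)
        return f"<task={task}> {frame_tokens} {dialogue_text}"

    # Question answering over a single static image.
    vqa = MultimodalExample(
        frames=["img_0"],
        dialogue=["What colour is the mug on the counter?"],
        target_text="red",
    )

    # Action execution conditioned on a sequence of previous frames.
    act = MultimodalExample(
        frames=["frame_0", "frame_1", "frame_2"],
        dialogue=["Pick up the mug and put it in the sink."],
        target_text="pickup mug",
    )

    for task, example in [("vqa", vqa), ("action", act)]:
        print(build_prompt(example, task), "->", example.target_text)

Under this framing, a single sequence-to-sequence model can be trained on both examples: only the prompt contents differ, not the task-specific output heads.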

Cite (APA)

Suglia, A., Hemanthage, B., Nikandrou, M., Pantazopoulos, G., Parekh, A., Eshghi, A., … Rieser, V. (2022). Demonstrating EMMA: Embodied MultiModal Agent for Language-guided Action Execution in 3D Simulated Environments. In SIGDIAL 2022 - 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (pp. 649–653). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.sigdial-1.62
