Abstract
In this paper, we introduce a simulation platform for modeling and building Embodied Human-Computer Interactions (EHCI). This system, VoxWorld, is a multimodal dialogue system enabling communication through language, gesture, action, facial expression, and gaze tracking in the context of task-oriented interactions. A multimodal simulation is an embodied 3D virtual realization of both the situational environment and the co-situated agents, as well as of the most salient content denoted by communicative acts in a discourse. It is built on the modeling language VoxML [7], which encodes objects with rich semantic typing and action affordances, and encodes actions themselves as multimodal programs, enabling contextually salient inferences and decisions in the environment. VoxWorld enables embodied HCI by situating both human and computational agents within the same virtual simulation environment, where they share perceptual and epistemic common ground.
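To make the notion of an object encoded "with rich semantic typing and action affordances" concrete, here is a minimal illustrative sketch in Python. The class and field names (`VoxObject`, `habitats`, `affordances`) are hypothetical stand-ins for the structure of a VoxML OBJECT entry, not the actual VoxML syntax:

```python
from dataclasses import dataclass, field

@dataclass
class VoxObject:
    """Hypothetical stand-in for a VoxML-style object entry."""
    name: str
    semantic_type: str                               # e.g., "artifact"
    habitats: dict = field(default_factory=dict)     # canonical orientations/placements
    affordances: list = field(default_factory=list)  # actions the object affords

    def affords(self, action: str) -> bool:
        # Affordance check: does this object support the given action?
        return action in self.affordances

# Example entry: a cup, typed as an artifact, with an "opening up" habitat
cup = VoxObject(
    name="cup",
    semantic_type="artifact",
    habitats={"up": "+Y"},
    affordances=["grasp", "lift", "fill", "drink_from"],
)
```

An agent reasoning in the simulation could then query `cup.affords("fill")` before attempting an action, which is the kind of contextually salient inference the abstract describes.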
Pustejovsky, J., & Krishnaswamy, N. (2020). Embodied Human-Computer Interactions through Situated Grounding. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, IVA 2020. Association for Computing Machinery, Inc. https://doi.org/10.1145/3383652.3423910