Multimodal Semantics for Affordances and Actions

Abstract

In this paper, we argue that, as HCI becomes more multimodal with the integration of gesture, gaze, posture, and other nonverbal behavior, it is important to understand the role played by affordances and their associated actions in human-object interaction (HOI), so as to facilitate reasoning in HCI and HRI environments. We outline the requirements and challenges involved in developing a multimodal semantics for human-computer and human-robot interactions. Unlike unimodal interactive agents (e.g., text-based chatbots or voice-based personal digital assistants), multimodal HCI and HRI inherently require a notion of embodiment, i.e., an understanding of the agent's placement within the environment and that of its interlocutor. We present a dynamic semantics for the language VoxML, which models human-computer, human-robot, and human-human interactions by creating multimodal simulations of both the communicative content and the agents' common ground, and we show the utility of VoxML information, reified within the environment, for the computational understanding of objects in HOI.
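To make the idea of reified object knowledge concrete, the sketch below shows how a VoxML-style object entry ("voxeme") with a habitat and an affordance structure might be encoded as a data structure. The field names (intrinsic_habitat, afford_str, etc.) and the toy "cup" entry are illustrative assumptions chosen to echo the general shape of VoxML entries, not the normative VoxML specification used in the paper.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative approximation of a VoxML-style voxeme.
# Field names and values are assumptions for exposition only.

@dataclass
class Affordance:
    """A conditioned behavior the object affords, e.g., a cup affords
    containing liquid when it is upright."""
    condition: str   # habitat condition under which the affordance holds
    action: str      # the action an agent may take
    result: str      # the resulting state

@dataclass
class Voxeme:
    lex: str                         # lexical predicate, e.g. "cup"
    obj_type: str                    # geometric head, e.g. "cylindroid"
    concavity: str                   # e.g. "concave"
    intrinsic_habitat: List[str]     # canonical orientation constraints
    afford_str: List[Affordance] = field(default_factory=list)

    def afforded_actions(self, current_habitat: List[str]) -> List[str]:
        """Return actions whose habitat condition is satisfied in the
        object's current configuration."""
        return [a.action for a in self.afford_str
                if a.condition in current_habitat]

# Toy entry: grasping is always afforded; containing liquid is afforded
# only when the cup is in its intrinsic (upright) habitat.
cup = Voxeme(
    lex="cup",
    obj_type="cylindroid",
    concavity="concave",
    intrinsic_habitat=["upright"],
    afford_str=[
        Affordance("upright", "contain(liquid)", "filled(cup)"),
        Affordance("any", "grasp(agent, cup)", "held(cup)"),
    ],
)

if __name__ == "__main__":
    print(cup.afforded_actions(current_habitat=["upright", "any"]))
    # -> ['contain(liquid)', 'grasp(agent, cup)']
```

In an HOI or HRI setting, reified entries of this kind let an embodied agent reason about which actions an object currently supports, and therefore which actions an interlocutor's gesture or utterance is most plausibly targeting.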

Citation (APA)

Pustejovsky, J., & Krishnaswamy, N. (2022). Multimodal Semantics for Affordances and Actions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13302 LNCS, pp. 137–160). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-05311-5_9
