Multi-modal Language Models for Human-Robot Interaction

Abstract

Recent progress in language models is enabling more flexible and natural conversation abilities in social robots. However, these language models were not designed for use in a physically embodied social agent: they cannot process the other modalities humans use in conversation, such as vision, to refer to the environment and to understand non-verbal communication. My work promotes the design of language models for physically embodied social interaction, shows how current technologies can be leveraged to give language models these abilities, and explores how such multi-modal language models can be used to improve interactions.

Citation (APA)

Janssens, R. (2024). Multi-modal Language Models for Human-Robot Interaction. In Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (pp. 109–111). ACM. https://doi.org/10.1145/3610978.3638371
