Abstract
Recent progress in language models is enabling more flexible and natural conversational abilities for social robots. However, these language models were never designed for use in physically embodied social agents. They cannot process the other modalities humans rely on in conversation, such as vision, to refer to the environment and interpret non-verbal communication. My work promotes the design of language models for physically embodied social interaction, shows how current technologies can be leveraged to enrich language models with these abilities, and explores how such multi-modal language models can improve interactions.
Citation
Janssens, R. (2024). Multi-modal Language Models for Human-Robot Interaction. In ACM/IEEE International Conference on Human-Robot Interaction (pp. 109–111). ACM. https://doi.org/10.1145/3610978.3638371