Abstract
Recent progress in language models is enabling more flexible and natural conversational abilities for social robots. However, these language models were never designed for use in physically embodied social agents. They cannot process the other modalities humans rely on in conversation, such as vision, to refer to the environment and interpret non-verbal communication. My work promotes the design of language models for physically embodied social interaction, shows how current technologies can be leveraged to enrich language models with these abilities, and explores how such multi-modal language models can improve interactions.
Citation
Janssens, R. (2024). Multi-modal Language Models for Human-Robot Interaction. In ACM/IEEE International Conference on Human-Robot Interaction (pp. 109–111). ACM. https://doi.org/10.1145/3610978.3638371