Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality


Abstract

In collaborative tasks, people rely on both verbal and non-verbal cues simultaneously to communicate with each other. For human-robot interaction to run smoothly and naturally, a robot should be equipped with the ability to robustly disambiguate referring expressions. In this work, we propose a model that disambiguates multimodal fetching requests using modalities such as head movements, hand gestures, and speech. We analysed data acquired from mixed reality experiments and formulated the hypothesis that modelling temporal dependencies of events across these three modalities increases the model's predictive power. We evaluated our model, built on a Bayesian framework for interpreting referring expressions, both with and without exploiting the temporal prior.
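The abstract contrasts Bayesian interpretation of referring expressions with and without a temporal prior. The sketch below is a minimal illustration of that idea, not the authors' implementation: the candidate objects, per-modality likelihoods, and the Gaussian temporal-offset weighting are all illustrative assumptions.

```python
# Minimal sketch: Bayesian fusion of multimodal evidence for a fetching
# request, with and without a temporal prior. All object names, likelihood
# values, and the Gaussian offset prior are hypothetical.
import numpy as np

OBJECTS = ["red block", "blue block", "green block"]  # hypothetical candidates

def normalize(p):
    p = np.asarray(p, dtype=float)
    return p / p.sum()

def fuse(speech_lik, head_lik, gesture_lik, temporal_weights=None):
    """Posterior over candidate objects given per-modality likelihoods.

    temporal_weights: optional per-modality weights reflecting how closely the
    observed events match the expected temporal ordering (e.g. gaze shortly
    before the spoken noun). Without them, modalities are weighted equally.
    """
    liks = np.vstack([speech_lik, head_lik, gesture_lik])
    if temporal_weights is None:
        log_post = np.log(liks).sum(axis=0)           # naive Bayes fusion
    else:
        w = np.asarray(temporal_weights)[:, None]
        log_post = (w * np.log(liks)).sum(axis=0)     # temporally weighted fusion
    return normalize(np.exp(log_post - log_post.max()))

def temporal_weight(offset_s, expected_s=0.4, sigma_s=0.3):
    """Toy temporal prior: weight an event by a Gaussian on its offset (in
    seconds) from the expected lag relative to the spoken referring word."""
    return float(np.exp(-0.5 * ((offset_s - expected_s) / sigma_s) ** 2))

# Hypothetical per-modality likelihoods over OBJECTS.
speech  = normalize([0.5, 0.3, 0.2])   # e.g. "the red one ..."
head    = normalize([0.4, 0.4, 0.2])   # head orientation is ambiguous
gesture = normalize([0.3, 0.6, 0.1])   # pointing slightly favours blue

# A gesture arriving much later than expected gets down-weighted.
weights = [1.0, temporal_weight(0.35), temporal_weight(1.5)]
print("without temporal prior:", fuse(speech, head, gesture))
print("with temporal prior:   ", fuse(speech, head, gesture, weights))
```

In this toy setup the temporal prior simply re-weights each modality's contribution before fusion; the paper's actual model may structure the temporal dependency differently.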

Citation (APA)

Sibirtseva, E., Ghadirzadeh, A., Leite, I., Björkman, M., & Kragic, D. (2019). Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11575 LNCS, pp. 108–123). Springer Verlag. https://doi.org/10.1007/978-3-030-21565-1_8
