In collaborative tasks, people rely on both verbal and non-verbal cues simultaneously to communicate with each other. For human-robot interaction to run smoothly and naturally, a robot should be equipped with the ability to robustly disambiguate referring expressions. In this work, we propose a model that disambiguates multimodal fetching requests using modalities such as head movements, hand gestures, and speech. We analysed the data acquired from mixed reality experiments and formulated the hypothesis that modelling temporal dependencies between events in these three modalities increases the model's predictive power. We evaluated our model within a Bayesian framework for interpreting referring expressions, with and without exploiting the temporal prior.
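To make the idea concrete, the following is a minimal sketch of Bayesian fusion of per-modality evidence with an optional prior, which could stand in for the temporal prior described above. This is an illustration only, not the authors' model: the `fuse` function, all likelihood values, and the form of the prior are hypothetical.

```python
# Hedged sketch: Bayesian fusion of modality evidence for disambiguating
# a fetching request. All names and numbers below are hypothetical; the
# paper's actual model is more elaborate than this.

def fuse(likelihoods, prior=None):
    """Combine per-modality likelihoods P(obs_m | object) for each candidate
    object, optionally weighting by a prior P(object), and normalise so the
    result is a posterior distribution over candidate objects."""
    objects = list(likelihoods[0].keys())
    if prior is None:
        prior = {o: 1.0 for o in objects}  # uniform prior if none given
    post = {}
    for o in objects:
        p = prior[o]
        for lik in likelihoods:  # naive-Bayes style independence assumption
            p *= lik[o]
        post[o] = p
    z = sum(post.values())
    return {o: p / z for o, p in post.items()}

# Hypothetical evidence from the three modalities for two candidate objects:
gaze   = {"cup": 0.6, "box": 0.4}   # head-movement evidence
point  = {"cup": 0.7, "box": 0.3}   # hand-gesture evidence
speech = {"cup": 0.5, "box": 0.5}   # speech evidence (ambiguous alone)

# A temporal prior might, e.g., upweight an object when the pointing gesture
# lands shortly after the spoken noun phrase:
temporal = {"cup": 0.8, "box": 0.2}

no_prior   = fuse([gaze, point, speech])
with_prior = fuse([gaze, point, speech], prior=temporal)
```

In this toy setting the temporal prior sharpens the posterior in favour of "cup", mirroring the paper's hypothesis that modelling temporal dependencies increases predictive power.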
Sibirtseva, E., Ghadirzadeh, A., Leite, I., Björkman, M., & Kragic, D. (2019). Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11575 LNCS, pp. 108–123). Springer Verlag. https://doi.org/10.1007/978-3-030-21565-1_8