Inferring human attention by learning latent intentions

Abstract

This paper addresses the problem of inferring 3D human attention in RGB-D videos at scene scale. 3D human attention describes where a human is looking in a 3D scene. We propose a probabilistic method that jointly models attention, intentions, and their interactions. Latent intentions guide human attention, which in turn reveals features of the intentions. Because of this mutual interaction, attention inference becomes a joint optimization with the latent intentions. An EM-based approach is adopted to learn the latent intentions and model parameters. Given an RGB-D video with 3D human skeletons, a joint-state dynamic programming algorithm is used to jointly infer the latent intentions, the 3D attention directions, and the attention voxels in scene point clouds. Experiments on a new 3D human attention dataset demonstrate the strength of our method.
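
The abstract does not spell out the joint-state dynamic programming step, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It runs a Viterbi-style recursion over joint states made of a latent intention index and a discretized attention direction index. The function name `joint_state_viterbi`, the log-score arrays `emit`, `trans_int`, and `trans_dir`, and the assumption that the joint transition score factorizes into an intention term plus a direction term are all hypothetical choices made for illustration.

```python
import numpy as np

def joint_state_viterbi(emit, trans_int, trans_dir):
    """Most likely sequence of (intention, direction) pairs.

    emit:      (T, K, D) log scores of each joint state per frame (assumed given)
    trans_int: (K, K) log scores for intention transitions (assumed)
    trans_dir: (D, D) log scores for direction transitions (assumed)
    """
    T, K, D = emit.shape
    # Flatten joint states as s = k * D + d and build the joint transition
    # table under the (assumed) factorization into intention + direction terms.
    trans = (trans_int[:, None, :, None] + trans_dir[None, :, None, :])
    trans = trans.reshape(K * D, K * D)

    score = emit[0].reshape(K * D).copy()
    back = np.zeros((T, K * D), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans          # cand[prev, cur]
        back[t] = cand.argmax(axis=0)          # best predecessor per state
        score = cand.max(axis=0) + emit[t].reshape(K * D)

    # Backtrack from the best final joint state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return [divmod(s, D) for s in path]        # (intention, direction) per frame
```

In this sketch the per-frame scores would come from whatever attention and intention model has been learned (for example, by EM); here they are simply taken as inputs.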

Cite

Wei, P., Xie, D., Zheng, N., & Zhu, S. C. (2017). Inferring human attention by learning latent intentions. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 0, pp. 1297–1303). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2017/180
