X-AWARE: ConteXt-AWARE Human-Environment Attention Fusion for Driver Gaze Prediction in the Wild

Lukas Stappen; Georgios Rizos; Björn Schuller

Conference Proceedings (Open Access)

ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction (2020) 858-867

DOI: 10.1145/3382507.3417967

Abstract

Reliable systems for automatic estimation of the driver's gaze are crucial for reducing the number of traffic fatalities and for many emerging research areas aimed at developing intelligent vehicle-passenger systems. Gaze estimation is a challenging task, especially in environments with varying illumination and reflection properties. Furthermore, there is wide diversity in the appearance of drivers' faces, both in terms of occlusions (e.g., vision aids) and cultural/ethnic backgrounds. For this reason, analysing the face along with contextual information (for example, the vehicle cabin environment) adds another, less subjective signal towards the design of robust systems for passenger gaze estimation. In this paper, we present an integrated approach to jointly model different features for this task. In particular, to improve the fusion of the visually captured environment with the driver's face, we have developed a contextual attention mechanism, X-AWARE, attached directly to the output convolutional layers of InceptionResNetV2 networks. To showcase the effectiveness of our approach, we use the Driver Gaze in the Wild dataset, recently released as part of the Eighth Emotion Recognition in the Wild (EmotiW) Challenge. Our best model outperforms the baseline by an absolute 15.03% in accuracy on the validation set, and improves on the previously best reported result by an absolute 8.72% on the test set.
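
The abstract does not spell out how the contextual attention is computed, so the following is only a minimal sketch of one generic way to fuse a face feature map with an environment feature map using attention. It is not the authors' X-AWARE implementation. The function name `context_attention_fusion`, the projection matrix `w_proj`, the dot-product scoring, and the small feature-map shapes are all assumptions made for illustration; random arrays stand in for the convolutional outputs of the two backbone networks.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def context_attention_fusion(face_map, context_map, w_proj):
    """Hypothetical context-conditioned attention over face features.

    face_map:    (H, W, C) feature map from a face backbone (assumed).
    context_map: (H, W, C_ctx) feature map from an environment backbone (assumed).
    w_proj:      (C_ctx, C) projection from context space to face space (assumed).
    """
    h, w, c = face_map.shape
    face_flat = face_map.reshape(h * w, c)        # one row per spatial location

    # Summarise the environment with global average pooling.
    context_vec = context_map.mean(axis=(0, 1))   # (C_ctx,)

    # Use the projected context as a query against every face location.
    query = context_vec @ w_proj                  # (C,)
    scores = face_flat @ query / np.sqrt(c)       # (H*W,)
    weights = softmax(scores)                     # attention over locations

    # Attention-weighted face descriptor, concatenated with the context summary.
    attended_face = weights @ face_flat           # (C,)
    fused = np.concatenate([attended_face, context_vec])
    return fused, weights.reshape(h, w)


# Random stand-ins for backbone outputs; shapes are arbitrary for the demo.
rng = np.random.default_rng(0)
face_map = rng.standard_normal((5, 5, 16))
context_map = rng.standard_normal((5, 5, 16))
w_proj = rng.standard_normal((16, 16)) * 0.1

fused, weights = context_attention_fusion(face_map, context_map, w_proj)
```

In a sketch like this, `fused` would feed a classifier over gaze zones, and `weights` indicates which face locations the environment summary emphasised. In a trained model the projection would be learned jointly with the rest of the network rather than drawn at random.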

Cite

Stappen, L., Rizos, G., & Schuller, B. (2020). X-AWARE: ConteXt-AWARE Human-Environment Attention Fusion for Driver Gaze Prediction in the Wild. In ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 858–867). Association for Computing Machinery, Inc. https://doi.org/10.1145/3382507.3417967
