Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks


Abstract

In this paper, we propose attention-based neural encoder-decoder networks for predicting user Field-of-View (FoV) in 360-degree videos. Our proposed prediction methods are based on an attention mechanism that learns the weighted prediction power of historical FoV time series through end-to-end training. Attention-based neural encoder-decoder networks do not involve recursion and thus can be highly parallelized during training. Using publicly available 360-degree head movement datasets, we demonstrate that our FoV prediction models outperform state-of-the-art FoV prediction models, achieving lower prediction error, higher training throughput, and faster convergence. Better FoV prediction leads to reduced bandwidth consumption, better video quality, and improved user quality of experience.
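The abstract gives only a high-level description of the architecture. As a concrete illustration, below is a minimal PyTorch sketch (not the authors' code) of an attention-based encoder-decoder over a viewport time series. The representation of FoV as (yaw, pitch) angles, all layer sizes, and the learned-query decoder are illustrative assumptions, not details taken from the paper; the sketch only shows how attention replaces recursion so that all future steps are predicted in parallel.

# Minimal sketch of an attention-based encoder-decoder for FoV
# time-series prediction. Assumes FoV is given as (yaw, pitch)
# angles per frame; sizes and names are illustrative, not the
# authors' configuration.
import torch
import torch.nn as nn

class AttentionFoVPredictor(nn.Module):
    def __init__(self, fov_dim=2, d_model=64, nhead=4,
                 num_layers=2, horizon=30):
        super().__init__()
        self.input_proj = nn.Linear(fov_dim, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=128,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers)
        # One learned query per future step attends over the encoded
        # history; this avoids step-by-step recursion and lets the
        # whole prediction horizon be computed in parallel.
        self.queries = nn.Parameter(torch.randn(horizon, d_model))
        self.cross_attn = nn.MultiheadAttention(d_model, nhead,
                                                batch_first=True)
        self.output_proj = nn.Linear(d_model, fov_dim)

    def forward(self, history):
        # history: (batch, seq_len, fov_dim) past viewport angles
        memory = self.encoder(self.input_proj(history))
        q = self.queries.unsqueeze(0).expand(history.size(0), -1, -1)
        decoded, _ = self.cross_attn(q, memory, memory)
        return self.output_proj(decoded)  # (batch, horizon, fov_dim)

# Example: predict the next 30 viewport samples from 90 past samples.
model = AttentionFoVPredictor()
past = torch.randn(8, 90, 2)     # batch of 8 head-movement traces
future = model(past)             # shape: (8, 30, 2)
loss = nn.functional.mse_loss(future, torch.randn(8, 30, 2))
loss.backward()                  # trains end to end

Because neither the encoder nor the learned-query decoder is recurrent, every history position and every future step is processed in a single parallel pass, which is the property the paper credits for its higher training throughput.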

Citation (APA)

Yu, J., & Liu, Y. (2019). Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks. In Proceedings of the 11th ACM Workshop on Immersive Mixed and Virtual Environment Systems (MMVE 2019) (pp. 37–42). Association for Computing Machinery. https://doi.org/10.1145/3304113.3326118
