Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks


Abstract

In this paper, we propose attention-based neural encoder-decoder networks for predicting user Field-of-View (FoV) in 360-degree videos. Our proposed prediction methods are based on an attention mechanism that learns the weighted prediction power of historical FoV time series through end-to-end training. Attention-based neural encoder-decoder networks do not involve recursion and thus can be highly parallelized during training. Using publicly available 360-degree head movement datasets, we demonstrate that our FoV prediction models outperform state-of-the-art FoV prediction models, achieving lower prediction error, higher training throughput, and faster convergence. Better FoV prediction leads to reduced bandwidth consumption, better video quality, and improved user quality of experience.
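The abstract gives only a high-level description of the architecture. As a concrete illustration, below is a minimal PyTorch sketch (not the authors' code) of an attention-based encoder-decoder over a viewport time series. The representation of FoV as (yaw, pitch) angles, all layer sizes, and the learned-query decoder are illustrative assumptions, not details taken from the paper; the sketch only shows how attention replaces recursion so that all future steps are predicted in parallel.

# Minimal sketch of an attention-based encoder-decoder for FoV
# time-series prediction. Assumes FoV is given as (yaw, pitch)
# angles per frame; sizes and names are illustrative, not the
# authors' configuration.
import torch
import torch.nn as nn

class AttentionFoVPredictor(nn.Module):
    def __init__(self, fov_dim=2, d_model=64, nhead=4,
                 num_layers=2, horizon=30):
        super().__init__()
        self.input_proj = nn.Linear(fov_dim, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=128,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers)
        # One learned query per future step attends over the encoded
        # history; this avoids step-by-step recursion and lets the
        # whole prediction horizon be computed in parallel.
        self.queries = nn.Parameter(torch.randn(horizon, d_model))
        self.cross_attn = nn.MultiheadAttention(d_model, nhead,
                                                batch_first=True)
        self.output_proj = nn.Linear(d_model, fov_dim)

    def forward(self, history):
        # history: (batch, seq_len, fov_dim) past viewport angles
        memory = self.encoder(self.input_proj(history))
        q = self.queries.unsqueeze(0).expand(history.size(0), -1, -1)
        decoded, _ = self.cross_attn(q, memory, memory)
        return self.output_proj(decoded)  # (batch, horizon, fov_dim)

# Example: predict the next 30 viewport samples from 90 past samples.
model = AttentionFoVPredictor()
past = torch.randn(8, 90, 2)     # batch of 8 head-movement traces
future = model(past)             # shape: (8, 30, 2)
loss = nn.functional.mse_loss(future, torch.randn(8, 30, 2))
loss.backward()                  # trains end to end

Because neither the encoder nor the learned-query decoder is recurrent, every history position and every future step is processed in a single parallel pass, which is the property the paper credits for its higher training throughput.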

Citation (APA)

Yu, J., & Liu, Y. (2019). Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks. In Proceedings of the 11th ACM Workshop on Immersive Mixed and Virtual Environment Systems (MMVE 2019) (pp. 37–42). Association for Computing Machinery. https://doi.org/10.1145/3304113.3326118
