E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

Citations: 8 | Mendeley readers: 27

Abstract

Recently, the image-wise implicit neural representation of videos, NeRV, has gained popularity for its promising results and fast speed compared to regular pixel-wise implicit representations. However, redundant parameters within the network structure can lead to a large model size when scaling up for desirable performance. The key reason for this is the coupled formulation of NeRV, which outputs the spatial and temporal information of video frames directly from the frame index input. In this paper, we propose E-NeRV, which dramatically expedites NeRV by decomposing the image-wise implicit neural representation into separate spatial and temporal contexts. Guided by this new formulation, our model greatly reduces redundant model parameters while retaining representation ability. We experimentally find that our method improves performance substantially with fewer parameters, converging more than 8× faster. Code is available at https://github.com/kyleleey/E-NeRV.
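
The sketch below illustrates the core idea described in the abstract: rather than mapping the frame index directly to pixels as NeRV does, a time-independent spatial feature grid is modulated by a temporal embedding of the frame index and then decoded into a frame. This is a minimal illustrative example, not the authors' exact E-NeRV architecture; all module names, layer sizes, and the fusion scheme are assumptions made for clarity.

```python
# Minimal sketch (assumed architecture, not the official E-NeRV code):
# a learned spatial grid provides spatial context, an MLP over the encoded
# frame index provides temporal context, and a convolutional decoder turns
# the fused features into an RGB frame.
import math
import torch
import torch.nn as nn


def positional_encoding(t: torch.Tensor, num_freqs: int = 8) -> torch.Tensor:
    """Sinusoidal encoding of normalized frame indices t in [0, 1]."""
    freqs = (2.0 ** torch.arange(num_freqs, dtype=t.dtype, device=t.device)) * math.pi
    angles = t[:, None] * freqs[None, :]                      # (B, num_freqs)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)  # (B, 2*num_freqs)


class DisentangledVideoINR(nn.Module):
    def __init__(self, channels: int = 64, grid_hw: tuple = (9, 16), num_freqs: int = 8):
        super().__init__()
        h, w = grid_hw
        self.num_freqs = num_freqs
        # Spatial context: a small learned feature grid shared by all frames.
        self.spatial_grid = nn.Parameter(torch.randn(1, channels, h, w) * 0.1)
        # Temporal context: maps the encoded frame index to a per-channel modulation.
        self.temporal_mlp = nn.Sequential(
            nn.Linear(2 * num_freqs, 256), nn.GELU(),
            nn.Linear(256, channels),
        )
        # Convolutional decoder that upsamples the fused features to a frame.
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, 4 * channels, 3, padding=1), nn.PixelShuffle(2), nn.GELU(),
            nn.Conv2d(channels, 4 * channels, 3, padding=1), nn.PixelShuffle(2), nn.GELU(),
            nn.Conv2d(channels, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (B,) normalized frame indices in [0, 1].
        temb = self.temporal_mlp(positional_encoding(t, self.num_freqs))   # (B, C)
        feat = self.spatial_grid * temb[:, :, None, None]  # fuse spatial and temporal context
        return self.decoder(feat)                          # (B, 3, 4*h, 4*w)


# Usage: fit the network so that model(t_i) reconstructs frame i of a single video.
model = DisentangledVideoINR()
frames = model(torch.tensor([0.0, 0.5, 1.0]))
print(frames.shape)  # torch.Size([3, 3, 36, 64])
```

In this sketch the temporal embedding only rescales the shared spatial features channel-wise; the point is that spatial structure is stored once and reused across frames, so scaling capacity does not require re-learning it per time step.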

Citation (APA)

Li, Z., Wang, M., Pi, H., Xu, K., Mei, J., & Liu, Y. (2022). E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13695 LNCS, pp. 267–284). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19833-5_16
