High Efficiency Video Coding (HEVC) doubles the compression rates over the previous H.264 standard for the same video quality. To improve the coding efficiency, HEVC adopts the hierarchical quadtree structured Coding Unit (CU). However, the computational complexity significantly increases due to the full search for Rate-Distortion Optimization (RDO) to find the optimal Coding Tree Unit (CTU) partition. Here, this paper proposes a deep learning model to predict the HEVC CU partition at inter-mode, instead of brute-force RDO search. To learn the learning model, a large-scale database for HEVC inter-mode is first built. Second, to predict the CU partition of HEVC, we propose as a model a combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. The simulation results prove that the proposed scheme can achieve a best compromise between complexity reduction and RD performance, compared to existing approaches.
CITATION STYLE
Bouaafia, S., Khemiri, R., Sayadi, F. E., Atri, M., & Liouane, N. (2020). A deep cnn-lstm framework for fast video coding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12119 LNCS, pp. 205–212). Springer. https://doi.org/10.1007/978-3-030-51935-3_22
Mendeley helps you to discover research relevant for your work.