Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces


Abstract

Most existing deep learning-based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards, such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266), are designed primarily for the YUV 4:2:0 format, in which the U and V components are subsampled to exploit properties of the human visual system. This paper investigates various DLEC designs that support the YUV 4:2:0 format, comparing their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. Experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for the RGB format and achieves about a 10% average BD-rate improvement over intra-frame coding in HEVC.
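To make the YUV 4:2:0 format concrete: the luma (Y) plane keeps full resolution, while each chroma plane (U, V) is subsampled by a factor of two in both dimensions, quartering its pixel count. The sketch below illustrates this with NumPy, using the analog BT.601 conversion coefficients as an illustrative assumption; it is not the paper's pipeline, only a minimal demonstration of the subsampling the abstract refers to.

```python
import numpy as np

def rgb_to_yuv420(rgb):
    """Convert an HxWx3 float RGB image (values in [0, 1]) to YUV 4:2:0.

    Illustrative sketch: uses BT.601 analog coefficients (an assumption,
    not taken from the paper) and simple 2x2 averaging for chroma
    subsampling. H and W must be even.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # full-resolution luma
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    # 4:2:0 subsampling: average each 2x2 block of the chroma planes,
    # halving their resolution in both dimensions.
    h, w = u.shape
    u420 = u.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    v420 = v.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, u420, v420

rgb = np.random.rand(8, 8, 3)
y, u, v = rgb_to_yuv420(rgb)
# y is 8x8; u and v are each 4x4, so chroma carries 1/4 the samples of luma.
```

A DLEC network targeting this format must handle two input resolutions (one luma plane plus two half-resolution chroma planes) rather than a single three-channel RGB tensor, which is why naive extensions of RGB architectures fall short.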

Citation (APA)

Egilmez, H., Singh, A. K., Coban, M., Karczewicz, M., Zhu, Y., Yang, Y., … Cohen, T. (2021). Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces. IEEE Open Journal of Signal Processing, 2, 441–452. https://doi.org/10.1109/OJSP.2021.3092257
