An optimized parallel IDCT on graphics processing units

Abstract

In this paper, we present an implementation of the H.264/AVC Inverse Discrete Cosine Transform (IDCT) optimized for Graphics Processing Units (GPUs) using OpenCL. Because most IDCT input coefficients in real videos are zero-valued, we create a new compacted data representation that allows for several optimizations. Experimental evaluations conducted on different GPUs show average speedups from 1.7x to 7.4x compared to an optimized single-threaded SIMD CPU version. © 2013 Springer-Verlag.
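The abstract does not specify the compacted layout, so the following is only an illustrative sketch of the general idea of storing nonzero coefficients alone. It assumes flattened 4x4 blocks of 16 integers and a hypothetical (block index, position, value) triple per nonzero coefficient; the names `compact` and `expand` are invented for this example, and the code runs on the CPU in Python rather than in OpenCL.

```python
from typing import List, Tuple

# Hypothetical layout (an assumption, not the paper's format):
# one (block_index, position, value) triple per nonzero coefficient.
Entry = Tuple[int, int, int]


def compact(blocks: List[List[int]]) -> List[Entry]:
    """Keep only the nonzero coefficients of each flattened block."""
    entries: List[Entry] = []
    for block_index, block in enumerate(blocks):
        for position, value in enumerate(block):
            if value != 0:
                entries.append((block_index, position, value))
    return entries


def expand(entries: List[Entry], num_blocks: int, block_size: int = 16) -> List[List[int]]:
    """Rebuild dense blocks; positions without an entry stay zero."""
    blocks = [[0] * block_size for _ in range(num_blocks)]
    for block_index, position, value in entries:
        blocks[block_index][position] = value
    return blocks


# Three flattened 4x4 blocks: DC-only, all-zero, and sparse.
blocks = [
    [12, 0, 0, 0] + [0] * 12,
    [0] * 16,
    [5, -3, 0, 0, 1] + [0] * 11,
]
entries = compact(blocks)
dense_count = sum(len(block) for block in blocks)
print(f"{len(entries)} stored entries instead of {dense_count} coefficients")
```

In this toy input the all-zero block contributes no entries at all, which is the kind of saving a compacted representation can offer: work and memory traffic scale with the number of nonzero coefficients rather than with the full block size.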

Cite

Wang, B., Alvarez-Mesa, M., Chi, C. C., & Juurlink, B. (2013). An optimized parallel IDCT on graphics processing units. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7640 LNCS, pp. 155–164). https://doi.org/10.1007/978-3-642-36949-0_18
