Analysis and Implementation of 8x8 DCT using ARM NEON Assembly I. Analysis of Fast DCT, N=8 II. Implementation of 8x8 DCT Algorithm using ARM NEON Assembly on RaspberryPi2

  • Biswas R
  • Engineer S

Abstract

The Discrete Cosine Transform (DCT) forms a major backbone of image processing and video encoding/decoding applications. The DCT/IDCT algorithm is a form of similarity transform. This paper analyzes and discusses the motivation behind the development of the Fast Discrete Cosine Transform algorithm, based on Chen, Fralick et al., 1977 [2], and C. Loeffler and Ligtenberg's Practical Fast 1D DCT Algorithms, 1984 [3]. Matrix decomposition techniques based on folding, rotation matrices, and Jacobi diagonalization are used to analyze the decomposition. Further, a proof of concept is presented in the form of a handwritten, optimized implementation in ARM NEON assembly language, which the authors report greatly improves processing performance. The paper attempts to explain the technique's usage in clear, accessible computing terms.
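As a rough illustration of the folding idea named in the abstract, the sketch below computes an 8-point DCT-II twice: once straight from the definition, and once after folding the input into four sums and four differences. It is not the paper's NEON implementation. The function names `dct8_direct` and `dct8_folded` are hypothetical, and the orthonormal scaling is an assumption made here for convenience.

```python
import math

N = 8  # transform length used throughout the paper


def _scale(k):
    # Orthonormal DCT-II scaling (an assumption; other conventions exist).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct8_direct(x):
    """8-point DCT-II straight from the definition: 8 products per output."""
    return [
        _scale(k) * sum(
            x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for n in range(N)
        )
        for k in range(N)
    ]


def dct8_folded(x):
    """Same transform after folding the input about its midpoint.

    Because cos((2(7-n)+1)k*pi/16) = (-1)^k * cos((2n+1)k*pi/16),
    even-indexed outputs depend only on the sums x[n] + x[7-n] and
    odd-indexed outputs only on the differences x[n] - x[7-n].
    """
    half = N // 2
    sums = [x[n] + x[N - 1 - n] for n in range(half)]
    diffs = [x[n] - x[N - 1 - n] for n in range(half)]
    out = []
    for k in range(N):
        folded = sums if k % 2 == 0 else diffs
        acc = sum(
            folded[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for n in range(half)
        )
        out.append(_scale(k) * acc)
    return out
```

Folding alone halves the number of products per output from 8 to 4. Fast algorithms of the kind the abstract cites go further by factoring the two remaining 4-point halves with rotations, which this sketch does not attempt. An 8x8 block transform can then be formed by applying the 1D transform to each row and then to each column.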

Cite

Biswas, R., & Engineer, Sr. (2016). Analysis and Implementation of 8x8 DCT using ARM NEON Assembly I. Analysis of Fast DCT, N=8 II. Implementation of 8x8 DCT Algorithm using ARM NEON Assembly on RaspberryPi2. Signal Processing, Image Processing, 24.
