Data layout transformation for stencil computations on short-vector SIMD architectures

Abstract

Stencil computations are at the core of applications in many domains, such as computational electromagnetics, image processing, and the partial differential equation solvers used in a variety of scientific and engineering applications. Short-vector SIMD instruction sets such as SSE and VMX provide a promising and widely available avenue for enhancing performance on modern processors. However, a fundamental memory stream alignment issue limits the performance achieved by stencil computations on modern short-vector SIMD architectures. In this paper, we propose a novel data layout transformation that avoids the stream alignment conflict, along with a static analysis technique for determining where this transformation is applicable. Significant performance increases are demonstrated for a variety of stencil codes on three modern SIMD-capable processors. © 2011 Springer-Verlag.
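
To make the alignment issue concrete: a 3-point stencil such as a[i-1] + a[i] + a[i+1] reads three streams of the same array that are offset from one another by one element, so they cannot all start on a vector boundary. The abstract does not describe the proposed layout, so the sketch below assumes one layout with the desired property, a dimension-lifted transpose: the array is viewed as a V x M matrix and stored transposed, which places each element's neighbours in the same lane of the adjacent vector. The names `V`, `to_lifted`, `from_lifted`, `stencil_reference` and `stencil_lifted` are hypothetical, and plain Python lists stand in for SIMD registers; this is an illustration of the idea, not the authors' implementation.

```python
V = 4  # assumed vector length in elements (e.g. four floats per 128-bit register)


def to_lifted(a, v=V):
    """View `a` as a v x m row-major matrix and store its transpose (m x v)."""
    n = len(a)
    if n % v != 0 or n // v < 2:
        raise ValueError("length must be a multiple of v with at least 2 vectors")
    m = n // v
    out = [0.0] * n
    for i, x in enumerate(a):
        r, c = divmod(i, m)   # original index i = r*m + c
        out[c * v + r] = x    # stored in vector c, lane r
    return out


def from_lifted(b, v=V):
    """Inverse of to_lifted."""
    n = len(b)
    m = n // v
    out = [0.0] * n
    for j, x in enumerate(b):
        c, r = divmod(j, v)
        out[r * m + c] = x
    return out


def stencil_reference(a):
    """3-point average on the original layout; the two end points are copied."""
    out = list(a)
    for i in range(1, len(a) - 1):
        out[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return out


def stencil_lifted(b, v=V):
    """Same stencil on the lifted layout.

    For interior vectors, every operand is a whole vector (c-1, c, c+1) read
    at the same lane, so a SIMD version would need only aligned loads.
    """
    n = len(b)
    m = n // v
    out = list(b)
    for c in range(1, m - 1):
        for r in range(v):  # one lane-wise vector operation in real SIMD code
            out[c * v + r] = (b[(c - 1) * v + r] + b[c * v + r] + b[(c + 1) * v + r]) / 3.0
    # First vector: the left neighbour of lane r sits in the last vector, lane r-1.
    # Lane 0 is the global left boundary and stays copied.
    for r in range(1, v):
        out[r] = (b[(m - 1) * v + r - 1] + b[r] + b[v + r]) / 3.0
    # Last vector: the right neighbour of lane r sits in the first vector, lane r+1.
    # Lane v-1 is the global right boundary and stays copied.
    for r in range(v - 1):
        out[(m - 1) * v + r] = (b[(m - 2) * v + r] + b[(m - 1) * v + r] + b[r + 1]) / 3.0
    return out
```

In this sketch only the first and last vectors need a lane shift; every other vector is updated from whole neighbouring vectors. Converting with `to_lifted`, applying `stencil_lifted`, and converting back with `from_lifted` gives the same values as `stencil_reference` on the original array.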

Cite

Henretty, T., Stock, K., Pouchet, L. N., Franchetti, F., Ramanujam, J., & Sadayappan, P. (2011). Data layout transformation for stencil computations on short-vector SIMD architectures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6601 LNCS, pp. 225–245). https://doi.org/10.1007/978-3-642-19861-8_13