Hypercubic storage layout and transforms in arbitrary dimensions using GPUs and CUDA

  • K. A. Hawick
  • D. P. Playne
  • 5

    Readers

    Mendeley users who have this article in their library.
  • 4

    Citations

    Citations of this article.

Abstract

Many simulations in the physical sciences are expressed in terms of
rectilinear arrays of variables. It is attractive to develop such
simulations for use in 1-, 2-, 3- or arbitrary physical dimensions and
also in a manner that supports exploitation of data-parallelism on fast
modern processing devices. We report on data layouts and transformation
algorithms that support both conventional and data-parallel memory
layouts. We present our implementations expressed in both conventional
serial C code as well as in NVIDIA's Compute Unified Device Architecture
concurrent programming language for use on general-purpose graphics
processing units. We discuss general memory layouts; specific
optimizations possible for dimensions that are powers of two; and common
transformations such as inverting, shifting and crinkling. We present
performance data for some illustrative scientific applications of these
layouts and transforms using several current GPU devices and discuss the
code and speed scalability of this approach. Copyright (C) 2010 John
Wiley & Sons, Ltd.

Author-supplied keywords

  • CUDA
  • GPUs
  • crinkling
  • data-parallelism
  • hypercubic indexing
  • shifting
