Data layout transformation for enhancing data locality on NUCA chip multiprocessors

  • Lu Q
  • Alias C
  • Bondhugula U
 et al. 

Abstract

With increasing numbers of cores, future CMPs (chip multi-processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. Although such an organization is effective for avoiding access hot-spots, it can cause a significant number of non-local L2 accesses for many commonly occurring regular data access patterns. In this paper we develop a compile-time framework for data locality optimization via data layout transformation. Using a polyhedral model, the program's localizability is determined by analysis of its index set and array reference functions, followed by non-canonical data layout transformation to reduce non-local accesses for localizable computations. Simulation-based results on a 16-core 2D tiled CMP demonstrate the effectiveness of the approach. The developed program transformation technique is also useful in several other data layout transformation contexts.
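The locality problem described above can be illustrated with a toy model. The following is a minimal sketch, not the paper's polyhedral framework: it assumes a 64-byte cache line, 8-byte elements, a one-dimensional array block-partitioned across 16 cores, and L2 banks interleaved at cache-line granularity. All function and constant names are hypothetical. It compares the fraction of accesses that hit the local bank under the canonical layout with the fraction under a simple strip-mine-and-permute layout that places each core's lines in its own bank.

```python
NUM_BANKS = 16                      # one L2 bank per tile (16-core CMP, as in the abstract)
LINE_BYTES = 64                     # assumed cache-line size
ELEM_BYTES = 8                      # assumed element size (e.g. a double)
ELEMS_PER_LINE = LINE_BYTES // ELEM_BYTES


def home_bank(index):
    """Bank that holds the element stored at `index`, assuming
    consecutive cache lines are interleaved round-robin across banks."""
    return (index // ELEMS_PER_LINE) % NUM_BANKS


def owner_core(i, n):
    """Core that computes on logical element i under a block partition.
    Assumes n is a multiple of NUM_BANKS * ELEMS_PER_LINE."""
    return i // (n // NUM_BANKS)


def canonical_layout(i, n):
    """Logical element i is stored at position i."""
    return i


def transformed_layout(i, n):
    """Illustrative non-canonical layout: split each core's block into
    cache lines, then interleave the lines of all cores so that line k
    of core c lands on a line whose home bank is c."""
    chunk = n // NUM_BANKS
    core, offset = divmod(i, chunk)
    line, word = divmod(offset, ELEMS_PER_LINE)
    return (line * NUM_BANKS + core) * ELEMS_PER_LINE + word


def local_fraction(layout, n):
    """Fraction of elements whose home bank is on the owning core's tile."""
    local = sum(home_bank(layout(i, n)) == owner_core(i, n) for i in range(n))
    return local / n


N = 4096  # 16 cores x 32 lines per core x 8 elements per line
print(local_fraction(canonical_layout, N))    # 0.0625
print(local_fraction(transformed_layout, N))  # 1.0
```

Under these assumptions, only one in sixteen accesses is local with the canonical layout, while the permuted layout makes every access local. The permutation is a bijection on the index range, so it needs no extra storage.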

Author-supplied keywords

  • Data layout optimization
  • NUCA cache
  • Polyhedral model

Authors

  • Qingda Lu

  • Christophe Alias

  • Uday Bondhugula

  • Thomas Henretty

  • Sriram Krishnamoorthy

  • J. Ramanujam
