Data layout transformation for enhancing data locality on NUCA chip multiprocessors

  • Lu Q
  • Alias C
  • Bondhugula U
 et al. 

With increasing numbers of cores, future CMPs (chip multi-processors) are likely to have a tiled architecture with a portion of the shared L2 cache on each tile and a bank-interleaved distribution of the address space. Although such an organization is effective for avoiding access hot spots, it can cause a significant number of non-local L2 accesses for many commonly occurring regular data access patterns. In this paper, we develop a compile-time framework for data locality optimization via data layout transformation. Using a polyhedral model, the program's localizability is determined by analysis of its index set and array reference functions, followed by non-canonical data layout transformation to reduce non-local accesses for localizable computations. Simulation-based results on a 16-core 2D tiled CMP demonstrate the effectiveness of the approach. The developed program transformation technique is also useful in several other data layout transformation contexts.
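
To make the non-local access problem concrete, the following is a minimal sketch of a toy model, not the paper's framework. It assumes a one-dimensional array, a block distribution of iterations over 16 cores, and an address space interleaved across L2 banks in fixed-size chunks. The constants `NUM_TILES`, `CHUNK`, and `N` and all function names are hypothetical, chosen only for illustration; the polyhedral analysis described in the abstract is not modeled here.

```python
NUM_TILES = 16   # assumed number of tiles/cores (one L2 bank per tile)
CHUNK = 8        # assumed interleaving granularity, in array elements
N = 1024         # assumed array length; core p owns a block of N // NUM_TILES elements
BLOCK = N // NUM_TILES


def home_bank(address):
    """Bank that holds `address` under chunk-wise bank interleaving."""
    return (address // CHUNK) % NUM_TILES


def canonical_layout(i):
    """Canonical layout: element i is stored at address i."""
    return i


def localized_layout(i):
    """Non-canonical layout: place element i in a chunk homed on its owner core.

    Element i belongs to core `owner` at position `offset` inside that core's
    block. Chunks of the same owner are spaced NUM_TILES chunks apart, so each
    of them falls on the owner's bank.
    """
    owner, offset = divmod(i, BLOCK)
    chunk_index, within_chunk = divmod(offset, CHUNK)
    return (chunk_index * NUM_TILES + owner) * CHUNK + within_chunk


def count_nonlocal(layout):
    """Count accesses A[i] whose home bank differs from the accessing core."""
    misses = 0
    for i in range(N):
        core = i // BLOCK  # block distribution of iterations over cores
        if home_bank(layout(i)) != core:
            misses += 1
    return misses


if __name__ == "__main__":
    print("canonical non-local accesses:", count_nonlocal(canonical_layout))
    print("localized non-local accesses:", count_nonlocal(localized_layout))
```

Under these assumed parameters, the canonical layout leaves 960 of the 1024 accesses non-local, while the remapped layout makes every access local and remains a one-to-one mapping onto the same address range. Real programs have multi-dimensional arrays and neighbor accesses, so this only illustrates the effect for the simplest localizable pattern.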

Author-supplied keywords

  • Data layout optimization
  • NUCA cache
  • Polyhedral model

Authors

  • Qingda Lu
  • Christophe Alias
  • Uday Bondhugula
  • Thomas Henretty
  • Sriram Krishnamoorthy
  • J. Ramanujam
