Exposing tunable parameters in multi-threaded numerical code

2Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Achieving high performance on today's architectures requires careful orchestration of many optimization parameters. In particular, the presence of shared-caches on multicore architectures makes it necessary to consider, in concert, issues related to both parallelism and data locality. This paper presents a systematic and extensive exploration of thecombined search space of transformation parameters that affect both parallelism and data locality in multi-threaded numerical applications.We characterize the nature of the complex interaction between blocking, problem decomposition and selection of loops for parallelism. We identify key parameters for tuning and provide an automatic mechanism for exposing these parameters to a search tool. A series of experiments on two scientific benchmarks illustrates the non-orthogonality of the transformation search space and reiterates the need for integrated transformation heuristics for achieving high-performance on current multicore architectures. © 2010 Springer-Verlag.

Cite

CITATION STYLE

APA

Qasem, A., Guo, J., Rahman, F., & Yi, Q. (2010). Exposing tunable parameters in multi-threaded numerical code. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6289 LNCS, pp. 46–60). https://doi.org/10.1007/978-3-642-15672-4_6

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free