Performance measurement for the OpenMP 4.0 offloading model

This article is free to access.

Abstract

OpenMP is one of the most widely used standards for enabling thread-level parallelism in high-performance computing codes. The recently released version 4.0 of the specification introduces directives that enable application developers to offload portions of the computation to massively parallel target devices. However, to utilize these devices efficiently, sophisticated performance analysis tools are required. The emerging OpenMP Tools Interface (OMPT) aids the development of portable tools, but currently lacks support for OpenMP 4.0 target directives. This paper presents a novel approach to measuring the performance of applications that use OpenMP offloading. It introduces libmpti, an OMPT-based measurement library for Intel MIC target devices. For host-side analysis, we extended the OPARI2 instrumenter and prototypically integrated the complete approach into the state-of-the-art tool infrastructure Score-P. We demonstrate the effectiveness of the presented method and implementation with a conjugate gradient (CG) kernel on an Intel Xeon Phi coprocessor. Finally, we visualize the obtained performance data with Vampir.

Cite

Dietrich, R., Schmitt, F., Grund, A., & Schmidl, D. (2014). Performance measurement for the OpenMP 4.0 offloading model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8806, pp. 291–301). Springer Verlag. https://doi.org/10.1007/978-3-319-14313-2_25
