Cache-Aware Roofline Model and Medical Image Processing Optimizations in GPUs

Abstract

When optimizing or porting applications to new architectures, a preliminary characterization is necessary to exploit the maximum computing power of the target devices. Profiling tools are available for numerous architectures and programming models, making it easier to spot possible bottlenecks. However, to support a better interpretation of the collected results, current profilers rely on insightful performance models. In this paper, we describe the Cache-Aware Roofline Model (CARM) and the tools for generating it, which enable the performance characterization of GPU architectures and workloads. We use CARM to characterize two kernels that are part of a 3D iterative reconstruction application for Computed Tomography (CT). These two kernels account for most of the execution time of the whole method and are therefore suitable for a deeper analysis. By exploring the model and the proposed methodology, the overall performance of the kernels has been improved by up to a factor of two compared to the previous implementations.
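As a rough illustration of the kind of bound a roofline-style model gives, the sketch below computes the attainable performance of a kernel as the minimum of the compute peak and the product of arithmetic intensity and memory bandwidth, with one roof per memory level as in CARM. The function names, the memory-level labels, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper or from any measured GPU.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: min(compute peak, intensity * bandwidth).

    intensity     -- arithmetic intensity in FLOP per byte
    peak_gflops   -- peak compute throughput in GFLOP/s
    bandwidth_gbs -- sustained bandwidth of one memory level in GB/s
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


def carm_roofs(intensity, peak_gflops, bandwidths_gbs):
    """Evaluate the bound for every memory level (one roof per level)."""
    return {
        level: attainable_gflops(intensity, peak_gflops, bw)
        for level, bw in bandwidths_gbs.items()
    }


# Hypothetical device parameters (assumptions, not measurements).
PEAK_GFLOPS = 4000.0
BANDWIDTHS_GBS = {"L1/shared": 2000.0, "L2": 1000.0, "DRAM": 250.0}

if __name__ == "__main__":
    for ai in (0.5, 2.0, 32.0):
        print(ai, carm_roofs(ai, PEAK_GFLOPS, BANDWIDTHS_GBS))
```

Under these assumed numbers, a kernel whose measured performance sits close to the DRAM roof is limited by global-memory traffic, while one close to the horizontal compute peak is compute bound; this is the kind of reading that guides which optimization to attempt first.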

Cite

Serrano, E., Ilic, A., Sousa, L., Garcia-Blas, J., & Carretero, J. (2018). Cache-Aware Roofline Model and Medical Image Processing Optimizations in GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11203 LNCS, pp. 509–526). Springer Verlag. https://doi.org/10.1007/978-3-030-02465-9_36
