CUDA introduces developers to a number of concepts (such as kernels, streams, warps, and explicit multi-level memory) beyond those they are used to in serial, parallel, and multi-threaded applications. Visibility into these elements is critical for troubleshooting and tuning applications that use CUDA. This paper highlights concepts implemented in CUDA 3.0 through 4.0, the complications they introduce for troubleshooting, and how TotalView helps the user deal with these new CUDA-specific constructs. © Springer-Verlag Berlin Heidelberg 2012.
Gottbrath, C., & Lüdtke, R. (2012). Debugging CUDA accelerated parallel applications with TotalView. In Proceedings of the 5th International Workshop on Parallel Tools for High Performance Computing 2011 (pp. 49–61). https://doi.org/10.1007/978-3-642-31476-6_5