Simple memory machine models for GPUs

Koji Nakano

Conference Proceedings

Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012 (2012) 794-803

DOI: 10.1109/IPDPSW.2012.98

Abstract

The main contribution of this paper is to introduce two parallel memory machines, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). Unlike well-studied theoretical parallel computational models such as PRAMs, these parallel memory machines are practical and capture the essential features of memory access on NVIDIA GPUs. Thus, algorithmic techniques developed on the DMM and the UMM can be used on current GPUs. As a first step in developing algorithmic techniques on the DMM and the UMM, we first evaluate the computing time for contiguous access and stride access to the memory on these models. We then present parallel algorithms to transpose a two-dimensional array on these models. Finally, we show that, for any given permutation, data in an array can be moved along that permutation on both the DMM and the UMM. Since the computing time of our permutation algorithms on the DMM and the UMM equals the sum of the lower bounds obtained from the memory bandwidth limitation and the latency overhead, they are optimal from a theoretical point of view. © 2012 IEEE.
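The contrast between contiguous and stride access mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a simplified reading of the two models: on the DMM, address a lives in bank a mod w and distinct addresses in the same bank are served one after another; on the UMM, addresses are split into groups of w consecutive addresses and each distinct group touched by a warp costs one stage. The function names, the width w = 4, and the three access patterns are hypothetical choices made for illustration, and latency is ignored.

```python
from collections import defaultdict


def dmm_stages(addresses, w):
    """Stages one warp needs under the assumed DMM rule.

    Assumption: address a is stored in bank a % w, and requests to
    distinct addresses in the same bank are serialized.
    """
    per_bank = defaultdict(set)
    for a in addresses:
        per_bank[a % w].add(a)
    return max(len(addrs) for addrs in per_bank.values())


def umm_stages(addresses, w):
    """Stages one warp needs under the assumed UMM rule.

    Assumption: addresses a with the same a // w form one address
    group, and each distinct group accessed costs one stage.
    """
    return len({a // w for a in addresses})


w = 4  # hypothetical width: threads per warp and number of banks

contiguous = [i for i in range(w)]        # thread i accesses address i
strided = [i * w for i in range(w)]       # thread i accesses address i * w
diagonal = [i * w + i for i in range(w)]  # thread i accesses address i * w + i

for name, pattern in [("contiguous", contiguous),
                      ("strided", strided),
                      ("diagonal", diagonal)]:
    print(name, dmm_stages(pattern, w), umm_stages(pattern, w))
```

Under these assumptions, contiguous access takes one stage on both models and stride-w access takes w stages on both, while the diagonal pattern avoids bank conflicts on the DMM yet still touches w address groups on the UMM.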

Citation (APA)

Nakano, K. (2012). Simple memory machine models for GPUs. In Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012 (pp. 794–803). https://doi.org/10.1109/IPDPSW.2012.98