Many HPC applications have memory requirements that exceed the memory typically available on compute nodes. While many HPC installations include resources with very large memory, a more portable solution for these applications is to implement an out-of-core method. An out-of-core mechanism offloads part of the data, typically to disk, while that data is not needed. However, this presents a problem in parallel codes, since the scalability of the approach is limited by disk latency and bandwidth. Moreover, on parallel file systems this design can place a heavy load on the file system and even cause failures. We present a library that provides out-of-core functionality by using the main memory of dedicated compute nodes. The library provides good performance and scalability, and it reduces the impact on the parallel file system by using only the local disk of each node. We have implemented an OpenSHMEM version of this library and compared its performance with that of MPI. OpenSHMEM, together with other Partitioned Global Address Space approaches, is one of the routes for improving the performance of parallel applications on the way to exascale. In this paper we show that OpenSHMEM is an excellent fit for this type of application.
CITATION STYLE
Gómez-Iglesias, A., Vienne, J., Hamidouche, K., Simmons, C. S., Barth, W. L., & Panda, D. (2015). Scalable out-of-core OpenSHMEM library for HPC. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9397, pp. 138–153). Springer Verlag. https://doi.org/10.1007/978-3-319-26428-8_9