This paper proposes h3-Open-SYS/WaitIO-Socket (WaitIO-Socket for short), a system-wide communication library that couples multiple MPI programs for heterogeneous coupling computing. WaitIO-Socket provides an inter-program communication environment among MPI programs and supports different MPI libraries running over various interconnects and processor types. We developed the library and tested it on the Wisteria/BDEC-01 supercomputing system, including Odyssey (Fujitsu A64FX-aarch64/Fujitsu-MPI/Tofu) and Aquarius (Intel Xeon-x86_64+NVIDIA-A100/Intel MPI/InfiniBand). The evaluation shows that WaitIO-Socket can execute large-scale programs on Wisteria, our first target system. MPI programs on Odyssey and Aquarius communicate through WaitIO-Socket and achieve 53.2 GB/s using multiple streams throughout the system. We also show that the application NICAM/ADA, run with the h3-Open-UTIL/MP coupler, is 35% faster on the combination of Odyssey (Arm CPU) and Aquarius (NVIDIA GPU) than on Odyssey (Arm CPU) alone.
CITATION STYLE
Sumimoto, S., Arakawa, T., Sakaguchi, Y., Matsuba, H., Yashiro, H., Hanawa, T., & Nakajima, K. (2023). A System-Wide Communication to Couple Multiple MPI Programs for Heterogeneous Computing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13798 LNCS, pp. 314–327). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-29927-8_25