MPJ Express is a messaging system that allows computational scientists to write and execute parallel Java applications on High Performance Computing (HPC) hardware. The software can execute in two modes: cluster and multicore. In cluster mode, parallel applications execute in a typical cluster environment where multiple processing elements communicate with one another over a fast interconnect such as Gigabit Ethernet or proprietary networks such as Myrinet and InfiniBand. In this context, the MPJ Express library provides communication devices for Ethernet and Myrinet. In multicore mode, the parallel Java application executes on a single system comprising shared memory or multicore processors. In this paper, we extend the MPJ Express software to provide two new communication devices, namely the native and hybrid devices. The goal of the native communication device is to interface the MPJ Express software with native MPI libraries, which are typically written in C. In this setting, the bulk of the messaging logic is offloaded to the underlying MPI library. This is attractive because MPJ Express can exploit the latest features of the native MPI library, such as support for new interconnects and efficient collective communication algorithms. The second device, called the hybrid device, is developed to allow efficient execution of parallel Java applications on clusters of shared memory or multicore processors. In this setting, the MPJ Express runtime system runs a single multithreaded process on each node of the cluster; the number of threads in each process equals the number of processing elements within a node. Our performance evaluation reveals that the native device allows MPJ Express to achieve performance comparable to native MPI libraries, for both latency and bandwidth of point-to-point and collective communications, which is a significant gain in performance compared to existing communication devices.
The hybrid communication device, without any modifications at the application level, also helps parallel applications achieve better speedups and scalability. We witnessed comparative performance for various benchmarks, including the NAS Parallel Benchmarks, with the hybrid device as compared to the existing Ethernet communication device on a cluster of shared memory/multicore processors. © The Authors. Published by Elsevier B.V.
Qamar, B., Javed, A., Jameel, M., Shafi, A., & Carpenter, B. (2014). Design and implementation of hybrid and native communication devices for Java HPC. In Procedia Computer Science (Vol. 29, pp. 184–197). Elsevier. https://doi.org/10.1016/j.procs.2014.05.017