Memory copies in messaging systems can be a major source of performance degradation in cluster computing. In this paper we discuss a system that relieves the host CPU of most of the overhead of copying data between distinct regions of host physical memory. The system is implemented as a special-purpose Linux device driver operating a generic, non-programmable Gigabit Ethernet adapter connected to itself. Whenever the descriptor-based DMA engines of the adapter are instructed to start a data communication, the data are read from host memory and, thanks to the loopback cable, written back to that same memory; this is semantically equivalent to a non-blocking memory copy operation performed by the two DMA engines. Suitable completion test and wait routines are also implemented, in order to provide traditional, blocking semantics in a split-phase fashion. An implementation of MPI using this system in place of traditional memcpy() calls on receive shows a significantly lower receive overhead. © Springer-Verlag Berlin Heidelberg 2003.
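The split-phase copy semantics described above can be illustrated with a minimal, single-threaded C sketch. It is not the paper's driver interface: the names copy_start(), copy_test(), copy_wait(), engine_step() and RING_SIZE are hypothetical, and the DMA engines are replaced by a software stand-in that completes one descriptor per step using memcpy(). The sketch only shows how a posted descriptor, a non-blocking completion test, and a blocking wait fit together.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8  /* hypothetical descriptor ring depth */

/* One copy request, loosely modeled on a DMA descriptor (assumed layout). */
struct copy_desc {
    void       *dst;
    const void *src;
    size_t      len;
    int         done;   /* set by the simulated engine on completion */
};

static struct copy_desc ring[RING_SIZE];
static unsigned head;   /* count of descriptors posted so far */
static unsigned tail;   /* count of descriptors completed so far */

/* Software stand-in for the hardware: completes at most one pending
 * descriptor per call. A real adapter would progress on its own. */
static void engine_step(void)
{
    if (tail == head)
        return;                         /* nothing pending */
    struct copy_desc *d = &ring[tail % RING_SIZE];
    memcpy(d->dst, d->src, d->len);
    d->done = 1;
    tail++;
}

/* Phase 1: post the copy and return immediately.
 * Returns a handle (ring slot), or -1 if the ring is full. */
int copy_start(void *dst, const void *src, size_t len)
{
    if (head - tail == RING_SIZE)
        return -1;
    unsigned slot = head % RING_SIZE;
    ring[slot] = (struct copy_desc){ dst, src, len, 0 };
    head++;
    return (int)slot;
}

/* Non-blocking completion test: 1 if the copy has finished, else 0. */
int copy_test(int handle)
{
    assert(handle >= 0 && handle < RING_SIZE);
    return ring[handle].done;
}

/* Phase 2: block until the copy identified by handle has finished.
 * The handle must have been returned by copy_start(). */
void copy_wait(int handle)
{
    while (!copy_test(handle))
        engine_step();                  /* simulation only: drive the engine */
}
```

A caller would post the copy with copy_start(), carry on with other work, and call copy_wait() only at the point where the destination buffer is actually needed; with real hardware the CPU would be free during that interval instead of executing memcpy().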
CITATION STYLE
Ciaccio, G. (2003). Using a self-connected Gigabit Ethernet adapter as a memcpy() low-overhead engine for MPI. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2840, 247–256. https://doi.org/10.1007/978-3-540-39924-7_37