The authors have designed a radix sort algorithm for vector multiprocessors and have implemented the algorithm on the CRAY Y-MP. On one processor of the Y-MP, the sort is over five times faster on large sorting problems than the optimized library sort provided by CRAY Research. On eight processors, an additional speedup of almost a factor of five is achieved, yielding a routine over 25 times faster than the library sort. Using the multiprocessor version, one can sort at a rate of 15 million 64-bit keys per second. This sorting algorithm is adapted from a data-parallel algorithm previously designed for the Connection Machine CM-2. To develop their version, the authors introduce three general techniques for mapping data-parallel algorithms onto vector multiprocessors. These techniques allow one to fully vectorize and parallelize the algorithm. The authors also derive equations that model the performance of the algorithm on the Y-MP. These equations are then used to optimize the radix size.
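For orientation, the role of the radix size can be illustrated with a plain serial least-significant-digit radix sort. This is a generic textbook sketch, not the authors' vectorized multiprocessor algorithm, and it says nothing about Y-MP performance. The names `radix_sort` and `num_passes` and the parameters `key_bits` and `radix_bits` are assumptions introduced here for illustration only.

```python
def num_passes(key_bits, radix_bits):
    """Number of digit passes needed: ceil(key_bits / radix_bits)."""
    return (key_bits + radix_bits - 1) // radix_bits


def radix_sort(keys, key_bits=64, radix_bits=8):
    """Sort non-negative integers below 2**key_bits, least significant digit first.

    Illustrative serial version; radix_bits is the tunable "radix size".
    """
    buckets = 1 << radix_bits
    mask = buckets - 1
    keys = list(keys)
    for shift in range(0, key_bits, radix_bits):
        # Histogram: count how many keys carry each digit value.
        counts = [0] * buckets
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum: starting offset of each bucket in the output.
        offsets = [0] * buckets
        total = 0
        for d in range(buckets):
            offsets[d] = total
            total += counts[d]
        # Stable permutation: place each key at its bucket's next free slot.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys
```

A larger `radix_bits` means fewer passes over the keys but a larger histogram per pass, which is the kind of trade-off a performance model can be used to balance. For example, 64-bit keys need `num_passes(64, 8) == 8` passes with 8-bit digits and `num_passes(64, 11) == 6` passes with 11-bit digits.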
Zagha, M., & Blelloch, G. E. (1991). Radix sort for vector multiprocessors (pp. 712–721). IEEE. https://doi.org/10.1145/125826.126164