Engineering a multi-core radix sort

Jan Wassenberg; Peter Sanders

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011), Vol. 6853 LNCS (Part 2), pp. 160–169

DOI: 10.1007/978-3-642-23397-5_16

38 Citations

35 Readers

Abstract

We present a fast radix sorting algorithm that builds upon a microarchitecture-aware variant of counting sort. Taking advantage of virtual memory and making use of write-combining yields a per-pass throughput corresponding to at least 89% of the system's peak memory bandwidth. Our implementation outperforms Intel's recently published radix sort by a factor of 1.64. It also compares favorably to the reported performance of an algorithm for Fermi GPUs when data-transfer overhead is included. These results indicate that scalar, bandwidth-sensitive sorting algorithms remain competitive on current architectures. Various other memory-intensive applications can benefit from the techniques described herein. © 2011 Springer-Verlag.
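To make the underlying idea concrete, here is a minimal sketch of a radix sort built from repeated counting-sort passes. It is not the authors' implementation: it is single-threaded, uses no write-combining or virtual-memory techniques, and it assumes a least-significant-digit pass order with 8-bit digits on unsigned 32-bit keys, none of which the abstract specifies. The function names `counting_sort_pass` and `radix_sort_u32` are hypothetical.

```python
def counting_sort_pass(keys, shift, bits=8):
    """One stable counting-sort pass on the digit (key >> shift) & mask."""
    radix = 1 << bits
    mask = radix - 1

    # Step 1: histogram of how many keys fall into each digit bucket.
    counts = [0] * radix
    for k in keys:
        counts[(k >> shift) & mask] += 1

    # Step 2: exclusive prefix sum gives each bucket's first output index.
    offsets = [0] * radix
    total = 0
    for d in range(radix):
        offsets[d] = total
        total += counts[d]

    # Step 3: scatter each key to the next free slot of its bucket.
    # Scanning the input in order keeps equal digits in input order (stable).
    out = [0] * len(keys)
    for k in keys:
        d = (k >> shift) & mask
        out[offsets[d]] = k
        offsets[d] += 1
    return out


def radix_sort_u32(keys, bits=8):
    """LSD radix sort for unsigned 32-bit integers (assumed key type)."""
    keys = list(keys)
    for shift in range(0, 32, bits):
        keys = counting_sort_pass(keys, shift, bits)
    return keys
```

Each pass reads the input twice (histogram, then scatter) and writes every key once to a data-dependent location, which is why the scatter step is the natural target for the memory-bandwidth optimizations the abstract mentions.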

Cite (APA)

Wassenberg, J., & Sanders, P. (2011). Engineering a multi-core radix sort. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6853 LNCS, pp. 160–169). https://doi.org/10.1007/978-3-642-23397-5_16
