Automated tuning in parallel sorting on multi-core architectures

Haibo Lin; Chao Li; Qian Wang; Yi Zhao; Ninghe Pan; Xiaotong Zhuang; Ling Shao

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6271 LNCS(PART 1) 14-25

DOI: 10.1007/978-3-642-15277-1_3

Abstract

Empirical search is an emerging strategy used in systems like ATLAS, FFTW and SPIRAL to find the implementation parameter values that deliver near-optimal performance on a particular machine. However, this approach has only proven successful for scientific kernels and serial symbolic sorting. Even commercial libraries like Intel MKL and IBM ESSL do not include parallel versions of sorting routines. In this paper we study empirical search in the generation of parallel sorting routines for multi-core systems. Parallel sorting presents new challenges: the relative performance of the algorithms depends not only on the characteristics of the architectures and input data, but also on the data partitioning schemes and thread interactions. We have studied parallel sorting algorithms including quick sort, cache-conscious radix sort, multiway merge sort, sample sort and quick-radix sort, and have built a sorting library using empirical search and an artificial neural network. Our results show that this sorting library could generate the best parallel sorting algorithms for different input sets on both x86 and SPARC multi-core architectures, with peak speedups of 2.2x and 3.9x, respectively. © 2010 Springer-Verlag.
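As an illustration of the empirical-search idea described in the abstract, the following is a minimal sketch and not the authors' library. It times a few candidate sorting routines on a sample of the input and keeps the fastest. All names (`radix_sort`, `merge_sort`, `empirical_search`, `CANDIDATES`) are hypothetical, the candidates are serial stand-ins rather than the parallel algorithms studied in the paper, and the artificial neural network component is not reproduced.

```python
import random
import time


def radix_sort(values, bits_per_pass=8):
    """LSD radix sort for non-negative integers (serial stand-in)."""
    if not values:
        return []
    out = list(values)
    mask = (1 << bits_per_pass) - 1
    largest = max(out)
    shift = 0
    while (largest >> shift) > 0:
        # Stable distribution into buckets keyed by the current digit.
        buckets = [[] for _ in range(mask + 1)]
        for v in out:
            buckets[(v >> shift) & mask].append(v)
        out = [v for bucket in buckets for v in bucket]
        shift += bits_per_pass
    return out


def merge_sort(values):
    """Top-down two-way merge sort (serial stand-in)."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left, right = merge_sort(values[:mid]), merge_sort(values[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def empirical_search(candidates, sample, repeats=3):
    """Time each candidate on the sample; return (fastest_name, timings)."""
    expected = sorted(sample)
    timings = {}
    for name, sort_fn in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            result = sort_fn(sample)
            best = min(best, time.perf_counter() - start)
            assert result == expected, f"{name} produced a wrong ordering"
        timings[name] = best
    return min(timings, key=timings.get), timings


# Hypothetical search space: algorithm choice plus one tunable parameter
# (the radix width), assumed here purely for illustration.
CANDIDATES = {
    "builtin": sorted,
    "radix_8bit": lambda v: radix_sort(v, 8),
    "radix_11bit": lambda v: radix_sort(v, 11),
    "merge": merge_sort,
}

if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.randrange(1 << 20) for _ in range(50_000)]
    winner, timings = empirical_search(CANDIDATES, sample)
    print("fastest on this machine and input:", winner)
```

The winner depends on the machine and on the input distribution, which is the point of searching empirically rather than fixing one algorithm in advance.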

Citation (APA)

Lin, H., Li, C., Wang, Q., Zhao, Y., Pan, N., Zhuang, X., & Shao, L. (2010). Automated tuning in parallel sorting on multi-core architectures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6271 LNCS, pp. 14–25). https://doi.org/10.1007/978-3-642-15277-1_3
