Tensor computations present significant performance challenges that impact a wide spectrum of applications. Efforts to improve the performance of tensor computations include exploring data layout, execution scheduling, and parallelism in common tensor kernels. This work presents a benchmark suite for arbitrary-order sparse tensor kernels using state-of-the-art tensor formats: coordinate (COO) and hierarchical coordinate (HiCOO). It provides a set of reference tensor kernel implementations and reports some observations on Intel CPUs and NVIDIA GPUs. The full paper is available at http://arxiv.org/abs/2001.00660.
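To make the two formats named above more concrete, here is a minimal Python sketch. It is not taken from the benchmark suite: the function names `coo_to_blocked` and `coo_ttv`, the `block_bits` parameter, and the sample data are all hypothetical. It assumes that COO stores one full coordinate tuple per nonzero, and that a HiCOO-style layout groups nonzeros into small blocks so that each coordinate splits into a block index and a short in-block offset. The tensor-times-vector product is used only as an illustrative kernel; the abstract does not list which kernels the suite covers.

```python
from collections import defaultdict

# COO: one coordinate tuple per nonzero, plus its value (third-order example).
coo_indices = [(0, 0, 1), (0, 3, 2), (5, 1, 0), (5, 1, 7)]
coo_values = [1.0, 2.0, 3.0, 4.0]


def coo_to_blocked(indices, values, block_bits=2):
    """Group nonzeros into blocks of side 2**block_bits (HiCOO-style sketch).

    Each coordinate is split into a block index (high bits) and an
    in-block offset (low bits). Hypothetical helper, not the suite's API.
    """
    mask = (1 << block_bits) - 1
    blocks = defaultdict(list)
    for idx, val in zip(indices, values):
        block = tuple(i >> block_bits for i in idx)
        offset = tuple(i & mask for i in idx)
        blocks[block].append((offset, val))
    return dict(blocks)


def coo_ttv(indices, values, vector, mode):
    """Multiply a COO tensor by a dense vector along `mode`.

    Returns a dict mapping the remaining coordinates to accumulated values.
    Works for any tensor order, since it only slices the index tuple.
    """
    out = defaultdict(float)
    for idx, val in zip(indices, values):
        rest = idx[:mode] + idx[mode + 1:]
        out[rest] += val * vector[idx[mode]]
    return dict(out)


blocked = coo_to_blocked(coo_indices, coo_values)
ttv_result = coo_ttv(coo_indices, coo_values, [1.0] * 8, mode=2)
```

With `block_bits=2`, the first two nonzeros fall into the same block `(0, 0, 0)`, so their block index is stored once and only the small offsets differ. That sharing is the intuition behind the storage savings of a hierarchical layout.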
Li, J., Lakshminarasimhan, M., Wu, X., Li, A., Olschanowsky, C., & Barker, K. (2020). A parallel sparse tensor benchmark suite on CPUs and GPUs. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP (pp. 403–404). Association for Computing Machinery. https://doi.org/10.1145/3332466.3374513