Commodity processors comprise several CPU cores and one integrated GPU. To fully exploit this type of architecture, one needs to automatically determine how to partition the workload between the two devices. This is especially challenging for irregular workloads, where the work of each iteration is data dependent and exhibits control and memory divergence. In this paper, we present a novel adaptive partitioning strategy specifically designed for irregular applications running on heterogeneous CPU-GPU chips. The main novelty of this work is that the size of the workload assigned to the GPU and to the CPU adapts dynamically to maximize the utilization of both devices while balancing the workload between them. Our experimental results on an Intel Haswell architecture using a set of irregular benchmarks show that our approach outperforms exhaustive static and adaptive state-of-the-art approaches in terms of performance and energy consumption.
Vilches, A., Asenjo, R., Navarro, A., Corbera, F., Gran, R., & Garzarán, M. (2015). Adaptive partitioning for irregular applications on heterogeneous CPU-GPU chips. In Procedia Computer Science (Vol. 51, pp. 140–149). Elsevier B.V. https://doi.org/10.1016/j.procs.2015.05.213