Occamy: Elastically Sharing a SIMD Co-processor across Multiple CPU Cores

Abstract

SIMD extensions are widely adopted in multi-core processors to exploit data-level parallelism. However, when workloads co-run on different cores, compute-intensive workloads cannot take advantage of the underutilized SIMD lanes allocated to memory-intensive workloads, which reduces overall performance. This paper proposes Occamy, a SIMD co-processor that can be shared by multiple CPU cores, so that their co-running workloads can spatially share its SIMD lanes. The key idea is to enable elastic spatial sharing by dynamically partitioning all the SIMD lanes across the workloads based on their phase behaviors, so that each workload may execute in a variable-length SIMD mode. We also introduce an Occamy compiler that supports this variable-length vectorization by analyzing the workloads' phase behaviors and generating vectorized code that works with varying vector lengths. We demonstrate that Occamy can improve SIMD utilization, and consequently performance, over three representative SIMD architectures, with negligible chip area cost.
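The two ideas in the abstract, partitioning a shared pool of SIMD lanes across workloads and running a loop whose vector length can change between iterations, can be illustrated with a minimal Python sketch. It is not Occamy's actual mechanism: the abstract does not describe the partitioning policy or the generated code. The function names `partition_lanes` and `vla_add`, the per-workload "demand" numbers, and the round-robin policy are all assumptions made for this example.

```python
from typing import Iterable, List


def partition_lanes(total_lanes: int, demands: List[int]) -> List[int]:
    """Split a shared pool of lanes across workloads (hypothetical policy).

    `demands[i]` is an assumed upper bound on the lanes workload i can use
    in its current phase. Lanes are handed out one at a time, round-robin,
    so a workload that wants few lanes leaves the rest to the others.
    """
    alloc = [0] * len(demands)
    remaining = total_lanes
    while remaining > 0:
        progressed = False
        for i, want in enumerate(demands):
            if remaining > 0 and alloc[i] < want:
                alloc[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # every demand is satisfied; leave lanes idle
            break
    return alloc


def vla_add(a: List[int], b: List[int], lane_schedule: Iterable[int]) -> List[int]:
    """Element-wise add written so the vector length may differ per step.

    `lane_schedule` yields the number of lanes available at each step
    (an assumption standing in for a runtime lane allocation). It must be
    finite or eventually yield positive values.
    """
    out: List[int] = []
    i, n = 0, len(a)
    for vl in lane_schedule:
        if i >= n:
            break
        if vl <= 0:  # no lanes granted this step; try the next one
            continue
        end = min(i + vl, n)  # the last chunk may be shorter than vl
        out.extend(x + y for x, y in zip(a[i:end], b[i:end]))
        i = end
    if i < n:
        raise ValueError("lane schedule exhausted before the loop finished")
    return out


if __name__ == "__main__":
    # A memory-bound phase (wants 2 lanes) next to a compute-bound one.
    print(partition_lanes(16, [2, 32]))
    print(vla_add(list(range(10)), [10] * 10, [4, 2, 8]))
```

In this sketch the compute-bound workload absorbs the lanes the memory-bound one cannot use, and the loop produces the same result regardless of how many lanes each step is granted, which is the property variable-length vectorized code needs.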

Cite

Zhang, Z., Ou, Y., Liu, Y., Wang, C., Zhou, Y., Wang, X., … Feng, X. (2023). Occamy: Elastically Sharing a SIMD Co-processor across Multiple CPU Cores. In International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (Vol. 3, pp. 483–497). Association for Computing Machinery. https://doi.org/10.1145/3582016.3582046
