An architecture for integrated near-data processors

Erik Vermij; Leandro Fiorin; Rik Jongerius; Christoph Hagleitner; Jan Van Lunteren; Koen Bertels

Journal Article (Open Access)

ACM Transactions on Architecture and Code Optimization (2017) 14(3)

DOI: 10.1145/3127069

4 Citations

18 Readers

Abstract

To increase the performance of data-intensive applications, we present an extension to a CPU architecture that enables arbitrary near-data processing capabilities close to the main memory. This is realized by introducing a component attached to the CPU system bus and a component at the memory side. Together they support hardware-managed coherence and virtual memory to integrate the near-data processors in a shared-memory environment. We present an implementation of the components, as well as a system simulator, providing detailed performance estimations. With a variety of synthetic workloads we demonstrate the performance of the memory accesses, the mixed fine- and coarse-grained coherence mechanisms, and the near-data processor communication mechanism. Furthermore, we quantify the inevitable start-up penalty regarding coherence and data writeback, and argue that near-data processing workloads should access data several times to offset this penalty. A case study based on the Graph500 benchmark confirms the small overhead for the proposed coherence mechanisms and shows the ability to outperform a real CPU by a factor of two.
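
The amortization argument in the abstract (a fixed start-up penalty offset by repeated data accesses) can be illustrated with a minimal break-even sketch. This is not the authors' model: the function name, its parameters, and the simple linear cost model (one fixed penalty plus a constant time per pass over the data) are assumptions made purely for illustration, and the numbers in the usage lines are hypothetical, not results from the paper.

```python
import math


def min_passes_to_break_even(startup_penalty, cpu_time_per_pass, ndp_time_per_pass):
    """Return the smallest number of passes n for which
    startup_penalty + n * ndp_time_per_pass < n * cpu_time_per_pass,
    or None if the near-data processor is not faster per pass.

    Assumed (hypothetical) linear cost model; all times share one unit.
    """
    saving_per_pass = cpu_time_per_pass - ndp_time_per_pass
    if saving_per_pass <= 0:
        # No per-pass gain, so the start-up penalty can never be recovered.
        return None
    # Strict inequality: n must exceed startup_penalty / saving_per_pass.
    return math.floor(startup_penalty / saving_per_pass) + 1


# Hypothetical numbers: penalty of 10 units, 4 units per pass on the CPU,
# 2 units per pass near the data.
passes = min_passes_to_break_even(10, 4, 2)
print(passes)
```

Under this assumed model, a workload that touches its data only once pays the full penalty for a single pass of savings, which is the intuition behind the recommendation that near-data workloads access data several times.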

Citation (APA)

Vermij, E., Fiorin, L., Jongerius, R., Hagleitner, C., Van Lunteren, J., & Bertels, K. (2017). An architecture for integrated near-data processors. ACM Transactions on Architecture and Code Optimization, 14(3). https://doi.org/10.1145/3127069
