An architecture for integrated near-data processors

Abstract

To increase the performance of data-intensive applications, we present an extension to a CPU architecture that enables arbitrary near-data processing capabilities close to the main memory. This is realized by introducing a component attached to the CPU system-bus and a component at the memory side. Together they support hardware-managed coherence and virtual memory support to integrate the near-data processors in a shared-memory environment. We present an implementation of the components, as well as a system simulator, providing detailed performance estimations. With a variety of synthetic workloads we demonstrate the performance of the memory accesses, the mixed fine- and coarse-grained coherence mechanisms, and the near-data processor communication mechanism. Furthermore, we quantify the inevitable start-up penalty regarding coherence and data writeback, and argue that near-data processing workloads should access data several times to offset this penalty. A case study based on the Graph500 benchmark confirms the small overhead for the proposed coherence mechanisms and shows the ability to outperform a real CPU by a factor of two.
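
The argument that a workload should access its data several times can be illustrated with a simple break-even calculation. The sketch below is not taken from the paper: it assumes a fixed, one-time start-up penalty (coherence and writeback) and a constant time per pass over the data on both the CPU and the near-data processor. The function name, parameter names, and all numbers are hypothetical.

```python
import math


def break_even_passes(startup_penalty, cpu_time_per_pass, ndp_time_per_pass):
    """Return the smallest whole number of passes over the data for which
    the near-data processor (NDP) is strictly faster than the CPU overall.

    Assumed model (illustrative only, not the paper's):
        CPU total time = n * cpu_time_per_pass
        NDP total time = startup_penalty + n * ndp_time_per_pass

    Returns None if the NDP is not faster per pass, since the start-up
    penalty can then never be recovered.
    """
    gain_per_pass = cpu_time_per_pass - ndp_time_per_pass
    if gain_per_pass <= 0:
        return None
    # NDP wins when n * gain_per_pass > startup_penalty.
    return math.floor(startup_penalty / gain_per_pass) + 1


# Hypothetical values in arbitrary time units.
passes = break_even_passes(
    startup_penalty=3.0,
    cpu_time_per_pass=2.0,
    ndp_time_per_pass=1.0,
)
print(passes)
```

Under this model, a larger start-up penalty or a smaller per-pass advantage raises the number of passes needed before offloading pays off, which is the intuition behind accessing the data several times.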

Cite

Vermij, E., Fiorin, L., Jongerius, R., Hagleitner, C., Van Lunteren, J., & Bertels, K. (2017). An architecture for integrated near-data processors. ACM Transactions on Architecture and Code Optimization, 14(3). https://doi.org/10.1145/3127069
