Transmuter: Bridging the efficiency gap using memory and dataflow reconfiguration

Abstract

With the end of Dennard scaling and Moore's law, it is becoming increasingly difficult to build hardware for emerging applications that meet power and performance targets, while remaining flexible and programmable for end users. This is particularly true for domains that have frequently changing algorithms and applications involving mixed sparse/dense data structures, such as those in machine learning and graph analytics. To overcome this, we present a flexible accelerator called Transmuter, in a novel effort to bridge the gap between General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs). Transmuter adapts to changing kernel characteristics, such as data reuse and control divergence, through the ability to reconfigure the on-chip memory type, resource sharing and dataflow at run-time within a short latency. This is facilitated by a fabric of light-weight cores connected to a network of reconfigurable caches and crossbars. Transmuter addresses a rapidly growing set of algorithms exhibiting dynamic data movement patterns, irregularity, and sparsity, while delivering GPU-like efficiencies for traditional dense applications. Finally, in order to support programmability and ease-of-adoption, we prototype a software stack composed of low-level runtime routines, and a high-level language library called TransPy, that cater to expert programmers and end-users, respectively. Our evaluations with Transmuter demonstrate average throughput (energy-efficiency) improvements of 5.0× (18.4×) and 4.2× (4.0×) over a high-end CPU and GPU, respectively, across a diverse set of kernels predominant in graph analytics, scientific computing and machine learning. Transmuter achieves energy-efficiency gains averaging 3.4× and 2.0× over prior FPGA and CGRA implementations of the same kernels, while remaining on average within 9.3× of state-of-the-art ASICs.

Cite

Pal, S., Feng, S., Park, D. H., Kim, S., Amarnath, A., Yang, C. S., … Dreslinski, R. (2020). Transmuter: Bridging the efficiency gap using memory and dataflow reconfiguration. In Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT (pp. 175–190). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3410463.3414627
