Designing a tunable nested data-parallel programming system

Abstract

This article describes Surge, a nested data-parallel programming system designed to simplify the porting and tuning of parallel applications to multiple target architectures. Surge decouples the high-level specification of computations, expressed using a C++ programming interface, from low-level implementation details using two first-class constructs: schedules and policies. Schedules describe the valid ways in which data-parallel operators may be implemented, while policies encapsulate a set of parameters that govern platform-specific code generation. These two mechanisms are used to implement a code generation system that analyzes computations and automatically generates a search space of valid platform-specific implementations. An input- and architecture-adaptive autotuning system then explores this search space to find optimized implementations. We express five real-world benchmarks in Surge, drawn from domains such as machine learning and sparse linear algebra. From these high-level specifications, Surge automatically generates CPU and GPU implementations that perform on par with or better than manually optimized versions.
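The separation between schedules, policies, and the autotuner can be pictured with a small C++ sketch. This is not Surge's interface: every name below (`Schedule`, `Policy`, `Candidate`, `search_space`, `autotune`) is hypothetical, the two schedules and the block sizes are arbitrary choices for illustration, and the tuner is a plain exhaustive timing loop on the CPU rather than the input- and architecture-adaptive system the abstract describes.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names throughout; none of this is Surge's actual API.

// A "schedule": one valid way to implement the nested operator.
enum class Schedule { RowSequential, RowBlocked };

// A "policy": parameters that steer the implementation of a schedule.
struct Policy {
    std::size_t block_size;  // segments per outer block (used by RowBlocked)
};

struct Candidate {
    Schedule schedule;
    Policy policy;
};

// Nested data: segment i is values[offsets[i] .. offsets[i + 1]).
struct Segmented {
    std::vector<std::size_t> offsets;
    std::vector<double> values;
};

// The high-level computation (sum of every segment), implemented according
// to the chosen candidate. Every candidate must produce the same result.
std::vector<double> segment_sums(const Segmented& in, const Candidate& c) {
    const std::size_t n = in.offsets.empty() ? 0 : in.offsets.size() - 1;
    std::vector<double> out(n, 0.0);
    auto sum_segment = [&](std::size_t i) {
        const auto first =
            in.values.begin() + static_cast<std::ptrdiff_t>(in.offsets[i]);
        const auto last =
            in.values.begin() + static_cast<std::ptrdiff_t>(in.offsets[i + 1]);
        out[i] = std::accumulate(first, last, 0.0);
    };
    if (c.schedule == Schedule::RowSequential) {
        for (std::size_t i = 0; i < n; ++i) sum_segment(i);
    } else {
        const std::size_t block = std::max<std::size_t>(1, c.policy.block_size);
        for (std::size_t start = 0; start < n; start += block) {
            const std::size_t end = std::min(n, start + block);
            for (std::size_t i = start; i < end; ++i) sum_segment(i);
        }
    }
    return out;
}

// Enumerate the search space: each schedule paired with its policy values.
std::vector<Candidate> search_space() {
    std::vector<Candidate> space;
    space.push_back({Schedule::RowSequential, Policy{1}});
    const std::size_t blocks[] = {4, 16, 64};
    for (std::size_t block : blocks) {
        space.push_back({Schedule::RowBlocked, Policy{block}});
    }
    return space;
}

// Time every candidate once on the given input and keep the fastest.
Candidate autotune(const Segmented& in) {
    const std::vector<Candidate> space = search_space();
    assert(!space.empty());
    Candidate best = space.front();
    auto best_time = std::chrono::steady_clock::duration::max();
    for (const Candidate& c : space) {
        const auto t0 = std::chrono::steady_clock::now();
        const std::vector<double> result = segment_sums(in, c);
        const auto elapsed = std::chrono::steady_clock::now() - t0;
        (void)result;  // only the timing matters here
        if (elapsed < best_time) {
            best_time = elapsed;
            best = c;
        }
    }
    return best;
}
```

Calling `autotune(input)` returns whichever candidate ran fastest on that particular input, which is why the choice can differ between inputs and machines. A practical tuner would repeat each measurement and prune the space rather than time every point once.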

Cite

Muralidharan, S., Garland, M., Sidelnik, A., & Hall, M. (2016). Designing a tunable nested data-parallel programming system. ACM Transactions on Architecture and Code Optimization, 13(4). https://doi.org/10.1145/3012011
