Optimizing parallel reduction in CUDA

  • Harris M
  • Blelloch G
  • Maggs B
  • et al.

Abstract

  • Common and important data-parallel primitive
  • Easy to implement in CUDA
  • Harder to get it right
  • Serves as a great optimization example
  • We'll walk step by step through 7 different versions
  • Demonstrates
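To make "easy to implement, harder to get right" concrete, here is a minimal sketch of one common tree-based sum reduction in CUDA. It is not taken from the article and is not claimed to match any of its 7 versions. The names `block_sum`, `gpu_sum`, and `kThreads` are hypothetical, the block size is assumed to be a power of two, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size; the halving loop below requires a power of two.
constexpr unsigned int kThreads = 256;

// Each block sums up to kThreads input elements in shared memory and
// writes one partial sum to partial[blockIdx.x].
__global__ void block_sum(const int* in, int* partial, unsigned int n) {
    extern __shared__ int sdata[];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Load one element per thread; pad out-of-range threads with 0.
    sdata[tid] = (i < n) ? in[i] : 0;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) {
            sdata[tid] += sdata[tid + s];
        }
        __syncthreads();  // every thread reaches this barrier
    }

    if (tid == 0) {
        partial[blockIdx.x] = sdata[0];
    }
}

// Hypothetical host helper: sums host_in on the GPU, then adds the
// per-block partial sums on the CPU.
int gpu_sum(const std::vector<int>& host_in) {
    const unsigned int n = static_cast<unsigned int>(host_in.size());
    if (n == 0) return 0;
    const unsigned int blocks = (n + kThreads - 1) / kThreads;

    int* d_in = nullptr;
    int* d_partial = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_partial, blocks * sizeof(int));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    block_sum<<<blocks, kThreads, kThreads * sizeof(int)>>>(d_in, d_partial, n);

    // This blocking copy also waits for the kernel to finish.
    std::vector<int> partial(blocks);
    cudaMemcpy(partial.data(), d_partial, blocks * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);

    int total = 0;
    for (int p : partial) total += p;
    return total;
}

int main() {
    std::vector<int> data(1000, 1);
    std::printf("sum = %d\n", gpu_sum(data));
    return 0;
}
```

The sketch leaves the final pass over the partial sums to the CPU for simplicity. A full GPU implementation would typically launch the kernel again on the partial sums, and could go on to tune memory access and thread utilization.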

Cite

Harris, M., Blelloch, G. E., Maggs, B. M., Govindaraju, N. K., Lloyd, B., Wang, W., … Margolin, L. G. (2007). Optimizing parallel reduction in CUDA. Proc. of ACM SIGMOD, 21, 13, 104–110.
