Speeding up mutual information computation using NVIDIA CUDA hardware

  • Ramtin Shams
  • Nick Barnes


We present an efficient method for mutual information (MI) computation between images (2D or 3D) for NVIDIA’s ‘compute unified device architecture’ (CUDA) compatible devices. Efficient parallelization of MI is particularly challenging on a ‘graphics processing unit’ (GPU) due to the need for histogram-based calculation of joint and marginal probability mass functions (pmfs) with a large number of bins. The data-dependent (unpredictable) nature of the updates to the histogram, together with hardware limitations of the GPU (lack of synchronization primitives and limited memory caching mechanisms), can make GPU-based computation inefficient. To overcome these limitations, we approximate the pmfs using a down-sampled version of the joint histogram, which avoids memory update problems. Our CUDA implementation improves the efficiency of MI calculations by a factor of 25 compared to a standard CPU-based implementation and can be used in MI-based image registration applications.
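To make the histogram-based calculation concrete, the following is a minimal CPU-side sketch in Python of how MI can be estimated from a joint histogram and its marginals. It is not the authors' CUDA implementation and does not reproduce their approximation scheme; the function name `mutual_information` and the parameters `bins` and `levels` are assumptions introduced for illustration only. Lowering `bins` gives a coarser joint histogram, which is loosely in the spirit of the down-sampling described in the abstract.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=64, levels=256):
    """Estimate MI (in bits) between two equally sized images.

    img_a, img_b: flat sequences of integer intensities in [0, levels).
    bins: histogram bins per image (hypothetical parameter); fewer bins
          means a coarser joint histogram.
    """
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must be non-empty and the same size")

    def to_bin(value):
        # Map an intensity to a bin index in [0, bins).
        return min(value * bins // levels, bins - 1)

    # Joint histogram: count co-occurring bin pairs. This data-dependent
    # update is the step that is hard to parallelize on a GPU.
    joint = Counter((to_bin(a), to_bin(b)) for a, b in zip(img_a, img_b))

    # Marginal histograms are the row and column sums of the joint histogram.
    marg_a, marg_b = Counter(), Counter()
    for (i, j), count in joint.items():
        marg_a[i] += count
        marg_b[j] += count

    n = len(img_a)
    # MI = sum over non-empty bins of p(i,j) * log2(p(i,j) / (p(i) * p(j))).
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log2(count * n / (marg_a[i] * marg_b[j]))
    return mi
```

In a registration loop, a function like this would be evaluated repeatedly for candidate transformations of one image, which is why the cost of building the joint histogram dominates.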
