Implementation of non local means filter in GPUs

Adrián Márques; Alvaro Pardo

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8258 LNCS(PART 1) 407-414

DOI: 10.1007/978-3-642-41822-8_51

14 Citations

6 Readers

Abstract

In this paper, we review some alternatives to reduce the computational complexity of the Non-Local Means image filter and present a CUDA-based implementation of it for GPUs, comparing its performance on different GPUs and with respect to reference CPU implementations. Starting from a naive CUDA implementation, we describe different aspects of CUDA and the algorithm itself that can be leveraged to decrease the execution time. Our GPU implementation achieved speedups of up to 35.8x with respect to our reduced-complexity reference implementation on the CPU, and more than 700x over a plain CPU implementation. © Springer-Verlag 2013.
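
For orientation, the kind of "plain CPU implementation" the abstract uses as a baseline can be sketched as follows. This is a minimal illustration of the textbook Non-Local Means filter, not the authors' code: the function name `nlm_naive`, the parameters `patch_radius`, `search_radius` and `h`, the Gaussian-style weight `exp(-d / h^2)`, and the border handling by clamping are all assumptions made for this example. Each output pixel is a weighted average of the pixels in its search window, with weights derived from the squared distance between the surrounding patches, so the cost per pixel grows with (search window area) x (patch area).

```python
import math

def nlm_naive(img, patch_radius=1, search_radius=2, h=0.1):
    """Naive Non-Local Means on a grayscale image given as a 2D list of floats.

    Illustrative sketch only; parameter names and defaults are assumptions.
    """
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Clamp coordinates so patches near the border reuse edge pixels
        # (one of several possible border policies; assumed here).
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return img[r][c]

    def patch_dist2(r1, c1, r2, c2):
        # Squared Euclidean distance between the patches centred at
        # (r1, c1) and (r2, c2).
        d = 0.0
        for dr in range(-patch_radius, patch_radius + 1):
            for dc in range(-patch_radius, patch_radius + 1):
                diff = px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)
                d += diff * diff
        return d

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weight_sum = 0.0
            acc = 0.0
            # Visit every candidate pixel inside the search window.
            for sr in range(r - search_radius, r + search_radius + 1):
                for sc in range(c - search_radius, c + search_radius + 1):
                    if not (0 <= sr < rows and 0 <= sc < cols):
                        continue
                    w = math.exp(-patch_dist2(r, c, sr, sc) / (h * h))
                    weight_sum += w
                    acc += w * img[sr][sc]
            # weight_sum >= 1 because the pixel always matches itself (w = 1).
            out[r][c] = acc / weight_sum
    return out
```

The two outer loops are independent for every output pixel, which is what makes the filter a natural candidate for GPU parallelization; the inner loops are where redundant patch comparisons accumulate and where reduced-complexity variants look for savings.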

Cite

Márques, A., & Pardo, A. (2013). Implementation of non local means filter in GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8258 LNCS, pp. 407–414). https://doi.org/10.1007/978-3-642-41822-8_51
