A multi-GPU implementation of a D2Q37 Lattice Boltzmann code

Luca Biferale; Filippo Mantovani; Marcello Pivanti; Fabio Pozzati; Mauro Sbragaglia; Andrea Scagliarini; Sebastiano Fabio Schifano; Federico Toschi; Raffaele Tripiccione

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7203 LNCS(PART 1) 640-650

DOI: 10.1007/978-3-642-31464-3_65

Abstract

We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on Nvidia Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices we adopted, and compare our performance results with an implementation optimized for latest-generation multi-core CPUs. Our program runs at ≈ 30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster. © 2012 Springer-Verlag.

Citation (APA)

Biferale, L., Mantovani, F., Pivanti, M., Pozzati, F., Sbragaglia, M., Scagliarini, A., … Tripiccione, R. (2012). A multi-GPU implementation of a D2Q37 Lattice Boltzmann code. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7203 LNCS, pp. 640–650). https://doi.org/10.1007/978-3-642-31464-3_65
