On the Efficiency of Python for High-Performance Computing: A Case Study Involving Stencil Updates for Partial Differential Equations

  • Langtangen H
  • Cai X

Abstract

The purpose of this paper is to assess the loss of computational efficiency that may occur when scientific codes are written in the Python programming language instead of Fortran or C. Our test problems concern the application of a seven-point finite stencil for a three-dimensional, variable coefficient, Laplace operator. This type of computation appears in many codes solving partial differential equations, and the variable coefficient is a key ingredient to capture the arithmetic complexity of stencils arising in advanced multi-physics problems in heterogeneous media. Different implementations of the stencil operation are described: pure Python loops over Python arrays, Psyco-acceleration of pure Python loops, vectorized loops (via shifted slice expressions), inline C++ code (via Weave), and migration of stencil loops to Fortran 77 (via F2py) and C. The performance of these implementations is compared against codes written entirely in Fortran 77 and C. We observe that decent performance is obtained with vectorization or migration of loops to compiled code. Vectorized loops run between two and five times slower than the pure Fortran and C codes. Mixed-language implementations, Python-Fortran and Python-C, where only the loops are implemented in Fortran or C, run at the same speed as the pure Fortran and C codes. At present, there are three alternative (and to some extent competing) implementations of Numerical Python: numpy, numarray, and Numeric. Our tests uncover significant performance differences between these three alternatives. Numeric is fastest on scalar operations with array indexing, while numpy is fastest on vectorized operations with array slices. We also present parallel versions of the stencil operations, where the loops are migrated to C for efficiency, and where the message passing statements are written in Python, using the high-level pypar interface to MPI. For the current test problems, there is hardly any efficiency loss from doing the message passing in Python. Moreover, adopting the Python interface of MPI gives a more elegant parallel implementation, both due to a simpler syntax of MPI calls and due to the efficient array slicing functionality that comes with Numerical Python.
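
To make the contrast between "pure Python loops" and "vectorized loops (via shifted slice expressions)" concrete, the following is a minimal sketch in modern NumPy. It is not the authors' code. The discretization is an assumption chosen for illustration: the operator is taken as div(a grad u) on a uniform grid with spacing h, the coefficient at each cell face is the arithmetic mean of the two neighbouring values of a, and boundary points are left at zero. The function names stencil_loops and stencil_vectorized are hypothetical.

```python
import numpy as np

# Offsets of the six neighbours in a seven-point stencil.
NEIGHBOUR_OFFSETS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1))


def stencil_loops(u, a, h):
    """Scalar version: explicit Python loops over the interior points."""
    out = np.zeros_like(u)
    nx, ny, nz = u.shape
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                s = 0.0
                for di, dj, dk in NEIGHBOUR_OFFSETS:
                    # Assumed face coefficient: arithmetic mean of a.
                    a_face = 0.5 * (a[i, j, k] + a[i + di, j + dj, k + dk])
                    s += a_face * (u[i + di, j + dj, k + dk] - u[i, j, k])
                out[i, j, k] = s / h**2
    return out


def shifted(offset):
    """Slice selecting the interior shifted by -1, 0 or +1 along one axis."""
    return slice(1 + offset, offset - 1 if offset < 1 else None)


def stencil_vectorized(u, a, h):
    """Vectorized version: the same sum written with shifted slices."""
    out = np.zeros_like(u)
    interior = (slice(1, -1),) * 3
    uc, ac = u[interior], a[interior]
    s = np.zeros_like(uc)
    for offsets in NEIGHBOUR_OFFSETS:
        idx = tuple(shifted(o) for o in offsets)
        s += 0.5 * (ac + a[idx]) * (u[idx] - uc)
    out[interior] = s / h**2
    return out
```

Both functions compute the same array; the vectorized one replaces the triple loop with six whole-array operations, which is the kind of rewrite the abstract reports as running two to five times slower than pure Fortran or C. The loop version is kept only as a readable reference for checking the slices.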

Cite

Langtangen, H. P., & Cai, X. (2008). On the Efficiency of Python for High-Performance Computing: A Case Study Involving Stencil Updates for Partial Differential Equations. In Modeling, Simulation and Optimization of Complex Processes (pp. 337–357). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-79409-7_23
