Investigating read performance of python and netCDF when using HPC parallel filesystems

0Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.
Get full text

Abstract

New methods need to be developed to handle the increasing size of data sets in atmospheric science - traditional analysis scripts often inefficiently read and process the data. NetCDF4 is a common file format used in atmospheric and ocean sciences, and Python is widely used in atmospheric and ocean science data analysis. The aim of this work is to provide insight into which read patterns and sizes are most effective when using the netCDF4-python library. Quantitative information on these would be useful information for scientists, library developers, and data managers. Three different read patterns were compared to simulate different types of reads: sequential, strided, and random, with each tested across three file systems - Panasas, Lustre, and GPFS. Read rate and standard deviation were measured using Python and C, reading from plain binary files and NetCDF4 files. Read performance for netCDF4-python was compared with the performance of native Python, the C NetCDF library, and the C Posix library. As expected, comparison between the different read modes shows that access pattern and read size significantly affect achieved performance. The results also show read performance profiles that are similar for the C, C NetCDF, and Python tests, however netCDF4-python performs less efficiently.

Cite

CITATION STYLE

APA

Jones, M., Blower, J., Lawrence, B., & Osprey, A. (2016). Investigating read performance of python and netCDF when using HPC parallel filesystems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9945 LNCS, pp. 153–168). Springer Verlag. https://doi.org/10.1007/978-3-319-46079-6_12

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free