This article is free to access.
Analyses of large ensemble data on future climate are highly useful for probabilistic projections of climate change in various interdisciplinary fields. However, the data volume of the Database for Policy Decision making for Future climate change (d4PDF), a mega-ensemble dataset, exceeds roughly 3 PB, which is too large to download to local computers. To allow users to retrieve and download only the data they need, we developed a user-friendly system called “System for Efficient content-based retrieval to Analyze Large volume climate data” (SEAL) under the Social Implementation Program on Climate Change Adaptation Technology (SI-CAT). Conventional web-based retrieval systems allow retrieval using metadata associated with the data file itself. In contrast, SEAL allows users to retrieve the necessary data using metadata associated with the contents of a data file, such as its physical values. We confirmed that, compared with conventional web-based retrieval systems, SEAL can reduce the data size and the total time required to obtain the necessary data to less than 0.5% and 1%, respectively.
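The difference between file-level and content-based retrieval can be illustrated with a minimal Python sketch. It is not SEAL's implementation or interface: the catalog layout, the file names, the `max_value` field, and both function names are hypothetical, chosen only to show how filtering on a physical value stored as content metadata can shrink the set of files a user has to download.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """One catalog entry (hypothetical layout, for illustration only)."""
    name: str         # file-level metadata: file name
    variable: str     # file-level metadata: variable stored in the file
    size_mb: float    # file size in megabytes
    max_value: float  # content metadata: largest physical value in the file


def retrieve_by_file_metadata(catalog, variable):
    """Conventional retrieval: select files by file-level metadata only."""
    return [r for r in catalog if r.variable == variable]


def retrieve_by_content(catalog, variable, threshold):
    """Content-based retrieval: also require a physical value >= threshold."""
    return [
        r for r in catalog
        if r.variable == variable and r.max_value >= threshold
    ]


def total_size(records):
    """Total download size in megabytes for the selected records."""
    return sum(r.size_mb for r in records)


# Made-up catalog of four ensemble-member files (not real d4PDF data).
CATALOG = [
    FileRecord("member001.nc", "precipitation", 100.0, 35.0),
    FileRecord("member002.nc", "precipitation", 100.0, 120.0),
    FileRecord("member003.nc", "precipitation", 100.0, 48.0),
    FileRecord("member004.nc", "temperature", 100.0, 31.0),
]

by_meta = retrieve_by_file_metadata(CATALOG, "precipitation")
by_content = retrieve_by_content(CATALOG, "precipitation", threshold=100.0)

print(total_size(by_meta))     # size when selecting on file metadata alone
print(total_size(by_content))  # size when also selecting on content
```

In this toy catalog, the file-level query selects all three precipitation files, while the content-based query selects only the one file whose maximum value reaches the threshold. The values are invented and say nothing about the reductions reported for SEAL.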
Nakagawa, Y., Onoue, Y., Kawahara, S., Araki, F., Koyamada, K., Matsuoka, D., … Kawase, H. (2020). Development of a system for efficient content-based retrieval to analyze large volumes of climate data. Progress in Earth and Planetary Science, 7(1). https://doi.org/10.1186/s40645-019-0315-9