GeoCube: A spatio-temporal cube toward massive and multi-source EO data analysis

Abstract

The volume of Earth Observation (EO) data has grown tremendously since EO systems were established. Managing such big EO data and turning it into valuable information is a major challenge in the EO domain. This study proposes a multisource EO cube for large-scale analysis. The infrastructure accommodates multisource geospatial data, including raster and vector data. A cube model is designed, and four dimensions are formalized: product, space, time, and band. Several cube exploration examples are presented. The infrastructure enables large-scale analysis based on cloud computing technology, and a set of distributed cube objects, which extend the Spark Resilient Distributed Dataset (RDD) to cube tiles, is designed. The distributed cube objects are compatible with multiple data sources, including raster and vector data. Multi-threaded computing is combined with cloud computing to form a hybrid parallelism that further improves data access and processing efficiency. Batch computation is also used to address the problem that a massive number of tiles cannot be loaded into memory at one time. Moreover, a machine learning-based approach is integrated into the cube to enhance parallel geoprocessing. The computational intensity of tiles can be predicted and saved in databases in advance, which eliminates the extra time cost of predicting computational intensity on the fly for commonly used products. The design and implementation of the cube infrastructure, named GeoCube, are provided. They cover the ingestion and management of multisource geospatial data in the cube, the processing of geospatial/EO queries against different cube dimensions, and high-performance cube computing on large-scale geospatial datasets. The creation of such a geospatial data cube helps advance the EO data cube approach while keeping connections to the data cube in the business intelligence (BI) domain. The performance of data query and access, data processing, and load balancing is presented. Results demonstrate the advantages of the GeoCube infrastructure. Several applications are presented, including cube OLAP operations, large-scale time-series analysis, and multisource data cube analysis. In conclusion, compared with existing cube approaches, the proposed infrastructure emphasizes the accommodation of multisource geospatial data (both raster and vector) in the cube, cube tile processing with cloud computing, and machine learning-enabled cube computation. Such a cube can inherit not only the large-scale processing capabilities of EO data cubes but also the data management capabilities of BI data cubes.
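
The abstract does not say how the stored computational intensity values are used to balance load, or how tiles are grouped into batches. The sketch below shows one common way this could be done and is not the GeoCube implementation. It assumes a greedy "heaviest tile first" assignment to the least-loaded partition and fixed-size batches. The names `balance_tiles`, `batches`, and the tile identifiers are hypothetical.

```python
import heapq
from typing import Dict, Iterator, List, Sequence


def balance_tiles(intensity: Dict[str, float], n_partitions: int) -> List[List[str]]:
    """Assign tiles to partitions so that predicted load is roughly even.

    Assumption: `intensity` maps a tile id to a computational intensity that
    was predicted earlier and looked up from a database. Greedy heuristic:
    take tiles from heaviest to lightest and give each one to the partition
    with the smallest accumulated load so far.
    """
    if n_partitions < 1:
        raise ValueError("n_partitions must be at least 1")

    # Min-heap of (accumulated_load, partition_index).
    heap = [(0.0, i) for i in range(n_partitions)]
    heapq.heapify(heap)
    partitions: List[List[str]] = [[] for _ in range(n_partitions)]

    # Sort by descending intensity; tie-break on tile id for determinism.
    ordered = sorted(intensity.items(), key=lambda kv: (-kv[1], kv[0]))
    for tile_id, cost in ordered:
        load, idx = heapq.heappop(heap)
        partitions[idx].append(tile_id)
        heapq.heappush(heap, (load + cost, idx))
    return partitions


def batches(items: Sequence, batch_size: int) -> Iterator[list]:
    """Yield consecutive fixed-size batches so that only one batch of tiles
    needs to be held in memory at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


if __name__ == "__main__":
    predicted = {"t1": 8, "t2": 7, "t3": 6, "t4": 5, "t5": 4}  # hypothetical values
    for part in balance_tiles(predicted, n_partitions=2):
        for batch in batches(part, batch_size=2):
            print(batch)
```

In a Spark setting, each resulting partition could correspond to one RDD partition, with a thread pool working through the batches inside it; that mapping is also an assumption rather than something stated in the abstract.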

Cite

Gao, F., Yue, P., Jiang, L., Cao, Z., Liang, Z., Shangguan, B., … Zhao, S. (2022). GeoCube: A spatio-temporal cube toward massive and multi-source EO data analysis. National Remote Sensing Bulletin, 26(6), 1051–1066. https://doi.org/10.11834/jrs.20210566
