Towards a foundation model for geospatial artificial intelligence (vision paper)


Abstract

Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks via fine-tuning, few-shot learning, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet to see an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first demonstrate the potential of this idea by testing existing large pre-trained language models (LLMs) (e.g., GPT-2 and GPT-3) on two geospatial semantics tasks. Results indicate that, in a few-shot learning setting, these task-agnostic LLMs can outperform task-specific, fully supervised models on both tasks by 2-9%. However, we also show the limitations of these existing foundation models given the multimodal nature of GeoAI, especially when dealing with geometries in conjunction with other modalities. We therefore discuss the possibility of a multimodal foundation model that can reason over various types of geospatial data through geospatial alignments. We conclude by discussing the unique risks and challenges of developing such a model for GeoAI.
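The few-shot setting the abstract describes can be sketched as follows. The task (place-type classification from a short description), the example records, and the label set below are hypothetical illustrations for how labeled examples are packed into an LLM prompt; they are not the actual benchmarks or prompts used in the paper.

```python
# Sketch of few-shot prompting an LLM on a geospatial semantics task.
# The task, example records, and labels are hypothetical illustrations,
# not the paper's actual benchmark data.

FEW_SHOT_EXAMPLES = [
    ("A building where patients receive medical treatment.", "hospital"),
    ("An open area with grass and trees for public recreation.", "park"),
    ("A place where trains stop to pick up passengers.", "railway station"),
]

def build_prompt(query_description: str) -> str:
    """Assemble a few-shot prompt: labeled examples followed by the query."""
    lines = ["Classify each place description into a place type.", ""]
    for description, place_type in FEW_SHOT_EXAMPLES:
        lines.append(f"Description: {description}")
        lines.append(f"Place type: {place_type}")
        lines.append("")
    lines.append(f"Description: {query_description}")
    lines.append("Place type:")  # the LLM is asked to complete this line
    return "\n".join(lines)

prompt = build_prompt("A shop selling bread and pastries.")
print(prompt)
```

The resulting prompt would be sent to a model such as GPT-3, whose completion of the final "Place type:" line serves as the prediction, with no task-specific training of the model itself.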

Citation (APA)

Mai, G., Cundy, C., Choi, K., Hu, Y., Lao, N., & Ermon, S. (2022). Towards a foundation model for geospatial artificial intelligence (vision paper). In GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems. Association for Computing Machinery. https://doi.org/10.1145/3557915.3561043
