Multimodal artificial intelligence foundation models: Unleashing the power of remote sensing big data in earth observation

  • Hong D
  • Li C
  • Zhang B
  • et al.

Abstract

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Citation: Hong D., Li C., Zhang B., et al. (2024). Multimodal artificial intelligence foundation models: Unleashing the power of remote sensing big data in earth observation. The Innovation Geoscience 2(1): 100055.

MULTIMODAL REMOTE SENSING BIG DATA

Earth observation (EO) techniques have undergone rapid development, facilitating comprehensive measurement and monitoring of the Earth's various facets, including land surface, subsurface, air, and water quality, as well as the well-being of humans, plants, and animals. Among these techniques, remote sensing (RS) emerges as a pivotal contact-free method for EO. RS enables the extraction of relevant information regarding the physical properties of Earth and its environmental systems from space. The abundance of diverse RS information introduces the concept of multimodality [1]. For simplicity, multimodal data refers to the description of the same object through various pieces of information or properties, such as image, text, sound, social media data, and video. Integrating these multiple perspectives enhances our ability to gain a comprehensive understanding of the Earth [2], in domains including but not limited to agriculture, forestry, ecology, and urban areas.

However, the escalating volume and diversity of RS data from various observation platforms, including spaceborne, airborne, and ground sources, underscore a pressing need to advance the multimodal processing and analysis capabilities of RS big data using artificial intelligence (AI) techniques [3]. This rapid expansion unavoidably introduces the following challenges.

  • Existing models fall significantly short in their capacity for information extraction and analysis.
  • Effectively harnessing and fully utilizing multimodal RS big data poses a significant bottleneck.
  • There is a notable deficiency in deep information mining and homogenization of applications.

Figure 1. A cycle-chain RS intelligent interpretation system enabled by multimodal AI foundation models for RS big data in EO. The cycle starts from different observation platforms, acquires the multimodal RS big data, trains well-designed multimodal AI foundation models, acts on downstream EO applications, applies them to clients in practice, and finally feeds back into the validation and design of payloads and platforms.

Cite

Hong, D., Li, C., Zhang, B., Yokoya, N., Benediktsson, J. A., & Chanussot, J. (2024). Multimodal artificial intelligence foundation models: Unleashing the power of remote sensing big data in earth observation. The Innovation Geoscience, 2(1), 100055. https://doi.org/10.59717/j.xinn-geo.2024.100055
