Remote sensing semantic segmentation based on multimodal feature alignment and fusion

Abstract

Accurate semantic segmentation of remote sensing data is essential to geoscience research and applications. Compared with traditional single-modal segmentation techniques, models based on multi-modal fusion have demonstrated superior performance and have attracted considerable attention in recent years. However, most of these models rely on convolutional neural networks (CNNs) or vision transformers (ViTs) for their fusion operations, which leads to inadequate modelling and representation of local-global context. In this study, we propose MFAFUNet, a multi-layer multi-modal feature alignment and fusion scheme intended to serve as a robust and effective multi-modal fusion backbone for semantic segmentation. The overall algorithmic framework is analogous to that of U-Net. First, the data from the different modalities are aggregated and the image size is reduced by multi-level downsampling modules based on the Haar wavelet transform. A feature extraction module composed of a CNN and a ViT extracts the high-frequency and low-frequency information of the features. Second, a semantic distribution alignment loss maps the high-level features of the different modalities into a common latent space and aligns their distributions, associating the complementary clues hidden in each modality. The effectiveness of the proposed method is demonstrated through experiments.
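
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract gives no architectural details. The function names `haar_downsample` and `mean_alignment_loss` are hypothetical. Stacking the modalities along the channel axis is an assumed reading of "aggregated". The loss shown is a simple stand-in (the squared distance between per-modality feature means), chosen only because the abstract does not state the form of the semantic distribution alignment loss.

```python
import numpy as np


def haar_downsample(x: np.ndarray) -> np.ndarray:
    """One level of the 2D Haar wavelet transform, used as a downsampler.

    x has shape (C, H, W) with even H and W. The result has shape
    (4*C, H/2, W/2): the low-frequency band first, then three
    high-frequency detail bands. The transform is orthonormal, so no
    information is discarded while the spatial size is halved.
    """
    if x.ndim != 3 or x.shape[1] % 2 or x.shape[2] % 2:
        raise ValueError("expected shape (C, H, W) with even H and W")
    a = x[:, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # local average (low frequency)
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal detail
    return np.concatenate([low, row_diff, col_diff, diag], axis=0)


def mean_alignment_loss(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Assumed stand-in for a distribution alignment loss.

    feat_a and feat_b have shape (N, D): N feature vectors per modality,
    already projected into a shared D-dimensional latent space. Returns
    the squared Euclidean distance between the two feature means.
    """
    gap = feat_a.mean(axis=0) - feat_b.mean(axis=0)
    return float(np.sum(gap ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    optical = rng.random((3, 8, 8))  # hypothetical 3-band optical patch
    height = rng.random((1, 8, 8))   # hypothetical 1-band elevation patch
    stacked = np.concatenate([optical, height], axis=0)
    print(haar_downsample(stacked).shape)  # 4 sub-bands per input channel
    print(mean_alignment_loss(rng.random((16, 32)), rng.random((16, 32))))
```

Keeping all four sub-bands, rather than only the low-frequency one, is what distinguishes this kind of downsampling from pooling or strided convolution: the high-frequency detail stays available to later layers.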

Cite

Chang, B., & Balz, T. (2025). Remote sensing semantic segmentation based on multimodal feature alignment and fusion. In International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives (Vol. 48, pp. 1785–1790). International Society for Photogrammetry and Remote Sensing. https://doi.org/10.5194/isprs-archives-XLVIII-G-2025-1785-2025
