Building extraction from very high-resolution remote sensing images is a fundamental task that is widely used in applications such as change detection, disaster assessment, and real-time updating of geographic information databases. However, because of the complexity of the geographical environment and the diversity of target features, accurate automatic building extraction remains very challenging. With the rapid development of deep learning techniques, convolutional neural networks (CNNs) have been widely used in remote sensing research and have achieved considerable results. For building detection over large urban areas, however, CNN-based methods often get trapped in local optima and generate many false positive detections around building boundaries. To avoid local optima and capture nonlocal information, this article proposes a hybrid feature extraction model that combines a CNN with a Transformer for automatic building detection from very high-resolution remote sensing images. In addition, a multiconstraint weighting mechanism is proposed to enhance the model's ability to recognize the regular geometric boundaries of buildings. Comprehensive experiments are conducted on three different datasets. The proposed MC-TRANSU achieves the best F1-score and intersection over union (IoU) compared with state-of-the-art methods such as SegNet, TransUnet, and Swin-Unet, with detection accuracy improved by around 5%. Quantitative and qualitative results verify the superiority and effectiveness of the proposed model.
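For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1-score and intersection over union are commonly computed for a binary building mask. It is not taken from the article: the function name `f1_and_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration, and the article's own evaluation protocol may differ.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for two equal-sized binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; 1 marks a building pixel.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted and present
            elif p and not t:
                fp += 1          # building predicted but absent
            elif t and not p:
                fn += 1          # building present but missed

    f1_den = 2 * tp + fp + fn    # F1 = 2TP / (2TP + FP + FN)
    iou_den = tp + fp + fn       # IoU = TP / (TP + FP + FN)
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_den if f1_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return f1, iou


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
f1, iou = f1_and_iou(pred, truth)
print(f"F1 = {f1:.3f}, IoU = {iou:.3f}")
```

Because both metrics are built from the same pixel counts, they are related by IoU = F1 / (2 - F1), so a method that ranks best on one will also rank best on the other for a single image.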
CITATION STYLE
Yuan, W., Ran, W., Shi, X., & Shibasaki, R. (2023). Multiconstraint Transformer-Based Automatic Building Extraction from High-Resolution Remote Sensing Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16, 9164–9174. https://doi.org/10.1109/JSTARS.2023.3319826