Scene classification for remote sensing images with self-attention augmented CNN

Zongyin Liu; Anming Dong; Jiguo Yu; Yubing Han; You Zhou; Kai Zhao

Journal Article, Open Access

IET Image Processing (2022) 16(11) 3085-3096

DOI: 10.1049/ipr2.12540

7 Citations

6 Readers

Abstract

Remote sensing scene classification aims to automatically assign a semantic label to each image. The task is challenging because remote sensing scene images are diverse and rich in spatial information. Recently, convolutional neural networks, such as the well-known Visual Geometry Group (VGG) network, have been widely used to address these difficulties. However, the VGG network, with its local receptive fields, cannot model the global information of remote sensing images well. It also needs a large number of parameters and floating-point operations to achieve satisfactory accuracy. To overcome these limitations, we introduce the self-attention mechanism into the VGG network. Specifically, we replace the last four convolutional layers of the VGG-19 network with two cascaded self-attention blocks, each consisting of two multi-head self-attention (MHSA) layers with a residual network structure. The new structure can explore local and global information from remote sensing scenes simultaneously. These changes both reduce the number of model parameters and improve classification performance. The effectiveness of the proposed method is validated through experiments on four public data sets: NaSC-TG2, WHU-RS19, AID and EuroSAT.
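The replacement described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not specify the internals of the MHSA layer, so the bias-free Q/K/V projections, the absence of an output projection, positional encoding and normalization, the toy sizes, and all function names (`mhsa_residual`, `conv3x3_params`, `mhsa_params`) are assumptions made for illustration. The sketch shows one attention layer with a residual connection applied four times (two blocks of two layers), plus a rough parameter count against the four 512-channel 3x3 convolutions at the end of VGG-19.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix; both are lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mhsa_residual(tokens, wq, wk, wv, heads):
    """One multi-head self-attention layer followed by a residual addition.

    tokens: N x C rows, one per spatial position of the flattened feature map.
    wq, wk, wv: C x C projection matrices (assumed bias-free, hypothetical).
    """
    n, c = len(tokens), len(tokens[0])
    d = c // heads  # channels handled by each head
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    out = [[0.0] * c for _ in range(n)]
    for h in range(heads):
        lo, hi = h * d, (h + 1) * d
        for i in range(n):
            # Position i attends to every position j: a global receptive field.
            scores = [
                sum(q[i][ch] * k[j][ch] for ch in range(lo, hi)) / math.sqrt(d)
                for j in range(n)
            ]
            weights = softmax(scores)
            for ch in range(lo, hi):
                out[i][ch] = sum(weights[j] * v[j][ch] for j in range(n))
    # Residual connection: add the layer input back to the attention output.
    return [[o + t for o, t in zip(o_row, t_row)] for o_row, t_row in zip(out, tokens)]


def conv3x3_params(c_in, c_out):
    """Weights plus biases of one 3x3 convolution."""
    return c_in * c_out * 9 + c_out


def mhsa_params(c):
    """Q, K and V projections only (assumption: no biases, no output projection)."""
    return 3 * c * c


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
C, HEADS, N = 8, 4, 16  # toy sizes: a 4x4 feature map flattened to 16 tokens
x = rand_matrix(N, C)
for _ in range(4):  # two cascaded blocks, each with two MHSA layers
    wq, wk, wv = (rand_matrix(C, C) for _ in range(3))
    x = mhsa_residual(x, wq, wk, wv, HEADS)

vgg_tail = 4 * conv3x3_params(512, 512)  # last four conv layers of VGG-19
attention_tail = 4 * mhsa_params(512)    # four MHSA layers under the assumptions above
```

Under these assumptions the four convolutions hold 9,439,232 parameters and the four attention layers 3,145,728, which is consistent in direction with the abstract's claim of a smaller model. The exact savings reported in the paper depend on design details not stated here.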

Citation (APA)

Liu, Z., Dong, A., Yu, J., Han, Y., Zhou, Y., & Zhao, K. (2022). Scene classification for remote sensing images with self-attention augmented CNN. IET Image Processing, 16(11), 3085–3096. https://doi.org/10.1049/ipr2.12540
