Learning discriminative features for visually similar classes is crucial for fine-grained image recognition. Bilinear pooling models use the outer product of embedding features to enhance representation capability and achieve favorable classification performance. However, these models produce exceedingly high-dimensional features, which makes them impractical for large-scale applications and may lead to overfitting. This article proposes a feature correlation residual method that mines the channel and spatial correlations of embedding features without increasing their dimensionality. In the residual module, each channel (or location) of the embedding features is computed from its channel (or spatial) correlation with all other channels (or locations). The resulting correlation residual features are then used to complement the original ones. In addition to the cross-entropy loss, a batch nuclear norm loss and a triplet loss on the extracted features are used as regularization to alleviate overfitting, enlarge inter-class variations, and reduce intra-class variations. Experimental results show that the method achieves state-of-the-art performance on several popular fine-grained image recognition datasets.
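The correlation residual idea described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the softmax normalization, the plain additive residual, and the function names (`channel_correlation_residual`, `spatial_correlation_residual`) are assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows that the output keeps the C x N shape of the input, whereas a bilinear (outer-product) descriptor would have C x C entries.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def channel_correlation_residual(x):
    """x is C x N: C channels, N flattened spatial locations.

    Assumed formulation: a C x C affinity from channel inner products,
    softmax-normalized per row, re-weights the channels; the result is
    added back to x as a residual. Output shape stays C x N.
    """
    affinity = [softmax(row) for row in matmul(x, transpose(x))]  # C x C
    residual = matmul(affinity, x)                                # C x N
    return [[xi + ri for xi, ri in zip(x_row, r_row)]
            for x_row, r_row in zip(x, residual)]


def spatial_correlation_residual(x):
    """Same idea across locations: apply the channel version to x^T."""
    return transpose(channel_correlation_residual(transpose(x)))


# Toy embedding: 2 channels, 3 spatial locations.
features = [[1.0, 0.0, 2.0],
            [0.5, 1.0, 0.0]]
channel_out = channel_correlation_residual(features)
spatial_out = spatial_correlation_residual(features)
```

Under these assumptions, both outputs have the same 2 x 3 shape as `features`, so the two residuals can be combined with the original embedding without growing the feature dimension.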
Xu, J., Wei, Y., & Deng, W. (2020). Feature Correlation Residual Network for Fine-Grained Image Recognition. IEEE Access, 8, 214322–214331. https://doi.org/10.1109/ACCESS.2020.3040857