Compressed self-attention for deep metric learning


Abstract

In this paper, we aim to enhance the self-attention (SA) mechanism for deep metric learning in visual perception by capturing richer contextual dependencies in visual data. To this end, we propose a novel module, named compressed self-attention (CSA), which significantly reduces the computation and memory cost with a negligible decrease in accuracy with respect to the original SA mechanism, thanks to the following two characteristics: i) it only needs to compute a small number of base attention maps for a small number of base feature vectors; and ii) the output at each spatial location can be obtained simply as an adaptive weighted average of the outputs calculated from the base attention maps. The high computational efficiency of CSA enables its application to high-resolution shallow layers in convolutional neural networks at little additional cost. In addition, CSA makes it practical to further partition the feature maps into groups along the channel dimension and compute attention maps for the features in each group separately, thus increasing the diversity of long-range dependencies and accordingly boosting the accuracy. We evaluate the performance of CSA via extensive experiments on two metric learning tasks: person re-identification and local descriptor learning. Qualitative and quantitative comparisons with the latest methods demonstrate the significance of CSA for this topic.
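
The two characteristics described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the base feature vectors are chosen or how the adaptive weights are computed, so the chunk-mean bases, the softmax similarity weights, and the names `compressed_self_attention`, `bases`, and `weights` are all assumptions made for illustration only. The sketch is meant to show why the cost drops from an n x n attention map to k maps of length n.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_self_attention(x):
    # Reference: plain self-attention over n positions, (n, n) attention map.
    attn = softmax(x @ x.T / np.sqrt(x.shape[1]))
    return attn @ x

def compressed_self_attention(x, k):
    # x: (n, c) features at n spatial locations with c channels.
    n, c = x.shape
    # ASSUMPTION: base vectors are the means of k contiguous chunks of positions.
    bases = np.stack([chunk.mean(axis=0) for chunk in np.array_split(x, k)])  # (k, c)
    # i) Only k base attention maps are computed, each over all n positions.
    base_attn = softmax(bases @ x.T / np.sqrt(c))   # (k, n)
    base_out = base_attn @ x                        # (k, c)
    # ii) Each location's output is a weighted average of the k base outputs.
    # ASSUMPTION: weights come from a softmax over similarity to the bases.
    weights = softmax(x @ bases.T / np.sqrt(c))     # (n, k)
    return weights @ base_out, base_attn, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8))                    # e.g. an 8x8 map, 8 channels
out, base_attn, weights = compressed_self_attention(x, k=4)
print(out.shape, base_attn.shape, weights.shape)
```

Under these assumptions the attention maps hold k * n entries instead of n * n, which is what would make the module affordable on high-resolution shallow layers. The channel-group variant mentioned in the abstract would apply the same function separately to slices of the channel dimension and concatenate the results.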

Cite

Chen, Z., Gong, M., Xu, Y., Wang, C., Zhang, K., & Du, B. (2020). Compressed self-attention for deep metric learning. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 3561–3568). AAAI Press. https://doi.org/10.1609/aaai.v34i04.5762
