Multi-label image classification is an important and challenging task in computer vision and multimedia. Most recent works capture only pair-wise dependencies among labels through statistical co-occurrence information, and therefore cannot automatically model high-order semantic relations. In this paper, we propose a high-order semantic learning model based on adaptive hypergraph neural networks (AdaHGNN) to boost multi-label classification performance. First, an adaptive hypergraph is constructed automatically from label embeddings. Second, image features are decoupled into feature vectors corresponding to each label, and hypergraph neural networks (HGNN) are employed to correlate these vectors and explore high-order semantic interactions. In addition, multi-scale learning is used to reduce sensitivity to inconsistencies in object size. Experiments are conducted on four benchmarks: MS-COCO, NUS-WIDE, Visual Genome, and Pascal VOC 2007, which cover large-, medium-, and small-scale label sets. State-of-the-art performance is achieved on three of them. Results and analysis demonstrate that the proposed method is able to capture high-order semantic dependencies.
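To make the HGNN step concrete: a minimal sketch of the standard hypergraph convolution (the Feng et al. formulation that HGNN-based models such as this one build on), applied to per-label feature vectors as nodes. This is an illustration under assumed shapes, not the authors' implementation; the incidence matrix `H`, projection `Theta`, and edge weights `w` are placeholders for what AdaHGNN learns adaptively.

```python
import numpy as np

def hgnn_conv(X, H, Theta, w=None):
    """One hypergraph convolution layer: X' = ReLU(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta).

    X:     (n, f)  node features -- here, one decoupled feature vector per label
    H:     (n, e)  incidence matrix (H[i, j] = 1 if node i is in hyperedge j)
    Theta: (f, f') learnable projection
    w:     (e,)    hyperedge weights (defaults to all ones)
    """
    n, e = H.shape
    w = np.ones(e) if w is None else w
    Dv = H @ w                       # node degrees
    De = H.sum(axis=0)               # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(Dv))
    De_inv = np.diag(1.0 / De)
    A = Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(A @ X @ Theta, 0.0)  # ReLU activation

# Toy example: 4 labels, 2 hyperedges grouping semantically related labels
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)
X = np.random.randn(4, 8)        # per-label feature vectors
Theta = np.random.randn(8, 8)
out = hgnn_conv(X, H, Theta)
print(out.shape)                 # (4, 8)
```

Each hyperedge connects an arbitrary number of labels at once, which is what lets the model express relations beyond pair-wise co-occurrence.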
Wu, X., Chen, Q., Li, W., Xiao, Y., & Hu, B. (2020). AdaHGNN: Adaptive Hypergraph Neural Networks for Multi-Label Image Classification. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 284–293). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3414046