Abstract
Because multimedia content significantly reduces the cognitive effort required of readers, it has become an increasingly important information type. More and more descriptions are paired with images to make them more attractive and persuasive. Several text-image retrieval methods have been developed to improve the efficiency of this time-consuming and professional pairing process. In practical retrieval applications, however, it is vivid and terse descriptions that are widely used, rather than shallow captions that merely state what an image contains. Consequently, most existing methods, which are designed for caption-style text, cannot serve this purpose. To eliminate this mismatch, we introduce a novel description-image retrieval problem and propose a specially designed method named Adapted Graph Reasoning and Filtration (AGRF). In AGRF, we first leverage an adapted graph reasoning network to discover combinations of visual objects in the image. Then, a cross-modal gate mechanism is proposed to filter out description-irrelevant combinations. Experimental results on a real-world dataset demonstrate the advantages of AGRF over state-of-the-art methods.
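The cross-modal gate described above can be illustrated with a minimal numpy sketch. This is an assumed, simplified formulation, not the paper's exact architecture: each visual-object combination feature is concatenated with the description embedding, a learned linear layer (`W`, `b` here are hypothetical parameters) produces a scalar gate via a sigmoid, and combinations whose gates are near zero are effectively filtered out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_gate(region_feats, text_feat, W, b):
    """Illustrative cross-modal gating (assumed formulation).

    region_feats: (n, d_v) features of n visual-object combinations
    text_feat:    (d_t,)   description embedding
    W, b:         learned gate parameters, W: (d_v + d_t, 1), b: (1,)

    Returns the gated region features and the per-combination gates;
    gates near 0 suppress description-irrelevant combinations.
    """
    n = region_feats.shape[0]
    text_tiled = np.tile(text_feat, (n, 1))                      # (n, d_t)
    fused = np.concatenate([region_feats, text_tiled], axis=1)   # (n, d_v + d_t)
    gates = sigmoid(fused @ W + b)                               # (n, 1)
    return gates * region_feats, gates
```

A usage sketch: with 5 candidate combinations of dimension 8 and an 8-dimensional text embedding, the gates are scalars in (0, 1) that rescale each combination before matching.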
Citation
Chen, S., Luo, Z., Gao, Y., Zhou, W., Li, C., & Chen, H. (2021). Adapted Graph Reasoning and Filtration for Description-Image Retrieval. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1839–1843). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3463047