Cross-modal retrieval with discriminative dual-path CNN

Abstract

Cross-modal retrieval aims at searching for semantically similar examples in one modality by using a query from another modality. Its typical applications include image-based text retrieval (IBTR) and text-based image retrieval (TBIR). Due to the rapid growth of multimodal data and the success of deep learning, cross-modal retrieval has received increasing attention and achieved significant progress in recent years. The dual-path CNN is a novel framework in this domain, which yields competitive performance by utilizing an instance loss and an inter-modal loss. However, it remains less discriminative in modeling the intra-modal relationship, which is also important for building a more discriminative cross-modal embedding network. To this end, we propose to incorporate an additional intra-modal loss into the framework to remedy this problem by preserving the intra-modal structure. Further, we develop a novel batch flexible sampling approach to train the entire network effectively and efficiently. Our approach, named Discriminative Dual-Path CNN (DDPC), achieves state-of-the-art results on the MS-COCO dataset, improving IBTR by 4.9% and TBIR by 5.9% in terms of Recall@1 on the 5K test set.
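The abstract does not give the exact form of the losses, but the idea of combining an inter-modal term (aligning image–text pairs) with intra-modal terms (preserving structure within each modality) can be sketched as follows. This is a minimal illustration, assuming a standard hinge-based bidirectional ranking loss over cosine similarities; the actual loss functions, weights, and sampling scheme of DDPC are defined in the paper, not here.

```python
import numpy as np

def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a and b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def ranking_loss(sim, margin=0.2):
    # Hinge-based bidirectional ranking loss: matched pairs lie on the
    # diagonal of `sim`; every off-diagonal entry acts as a negative.
    n = sim.shape[0]
    pos = np.diag(sim)
    cost_q = np.maximum(0.0, margin + sim - pos[:, None])  # query -> gallery
    cost_g = np.maximum(0.0, margin + sim - pos[None, :])  # gallery -> query
    mask = 1.0 - np.eye(n)  # exclude the positive pair itself
    return float(((cost_q + cost_g) * mask).sum() / n)

def total_loss(img_emb, txt_emb, w_inter=1.0, w_intra=0.5):
    # Inter-modal term aligns each image with its paired text; the
    # intra-modal terms (the addition motivated above, sketched here as the
    # same hinge loss applied within each modality) push apart embeddings
    # of different instances inside one modality. The weights are
    # illustrative, not taken from the paper.
    inter = ranking_loss(cosine_sim(img_emb, txt_emb))
    intra = (ranking_loss(cosine_sim(img_emb, img_emb)) +
             ranking_loss(cosine_sim(txt_emb, txt_emb)))
    return w_inter * inter + w_intra * intra
```

In a real training loop these terms would be computed on mini-batch embeddings produced by the two CNN paths and minimized jointly.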

Citation (APA)

Wang, H., Ji, Z., & Pang, Y. (2018). Cross-modal retrieval with discriminative dual-path CNN. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11166 LNCS, pp. 384–394). Springer Verlag. https://doi.org/10.1007/978-3-030-00764-5_35
