Abstract
Recently, deep hashing methods have attracted much attention in multimedia retrieval tasks, and some of them can even perform cross-modal retrieval. However, almost all existing deep cross-modal hashing methods are pairwise optimizing methods, which means they become time-consuming when extended to large-scale datasets. In this paper, we propose a novel tri-stage deep cross-modal hashing method, Dual Deep Neural Networks Cross-Modal Hashing (DDCMH), which employs two deep networks to generate hash codes for different modalities. Specifically, in Stage 1, it leverages a single-modal hashing method to generate the initial binary codes for the textual modality of the training samples; in Stage 2, these binary codes are used as supervised information to train an image network, which maps the visual modality to a binary representation; in Stage 3, the visual-modality codes are reconstructed according to a reconstruction procedure and used as supervised information to train a text network, which generates the binary codes for the textual modality. By doing this, DDCMH can make full use of inter-modal information to obtain high-quality binary codes, and it avoids the problem of pairwise optimization by optimizing the different modalities independently. The proposed method can be treated as a framework that extends any single-modal hashing method to the cross-modal search task. DDCMH is tested on several benchmark datasets, and the results demonstrate that it outperforms both deep and shallow state-of-the-art hashing methods.
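To make the tri-stage pipeline described above concrete, the following is a minimal sketch of the overall flow, not the authors' implementation: Stage 1 is stubbed with a random-projection sign hash standing in for any single-modal hashing method, Stage 2 regresses an image network onto the text codes, and Stage 3 binarizes the image network's outputs as a stand-in for the paper's reconstruction procedure before training the text network. All network architectures, feature dimensions, and helper names are hypothetical.

# Hedged sketch of the three-stage DDCMH pipeline (assumed components, not the paper's code).
import numpy as np
import torch
import torch.nn as nn

def single_modal_hashing(text_features, n_bits):
    """Stage 1 (stub): initial binary codes for the text modality.
    Any off-the-shelf single-modal hashing method could be plugged in here;
    a random sign projection is used only to keep the sketch self-contained."""
    rng = np.random.default_rng(0)
    projection = rng.standard_normal((text_features.shape[1], n_bits))
    return np.sign(text_features @ projection)

class HashNet(nn.Module):
    """A tiny MLP mapping modality features to n_bits tanh-relaxed codes (illustrative only)."""
    def __init__(self, in_dim, n_bits):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, n_bits), nn.Tanh()
        )
    def forward(self, x):
        return self.net(x)

def train_to_match(net, features, target_codes, epochs=50, lr=1e-3):
    """Regress the network's relaxed codes onto fixed target binary codes."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    x = torch.as_tensor(features, dtype=torch.float32)
    b = torch.as_tensor(target_codes, dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((net(x) - b) ** 2).mean()
        loss.backward()
        opt.step()
    return net

# Toy data standing in for extracted text/image features of the training set.
n, text_dim, img_dim, n_bits = 512, 300, 512, 32
text_feat = np.random.randn(n, text_dim)
img_feat = np.random.randn(n, img_dim)

# Stage 1: initial text binary codes from a single-modal hashing method.
text_codes = single_modal_hashing(text_feat, n_bits)

# Stage 2: train the image network with the text codes as supervision.
img_net = train_to_match(HashNet(img_dim, n_bits), img_feat, text_codes)

# Stage 3: reconstruct visual-modality codes and use them to supervise the text
# network (binarizing the image network's outputs is a stand-in for the paper's
# reconstruction procedure, which the abstract does not detail).
with torch.no_grad():
    img_codes = torch.sign(img_net(torch.as_tensor(img_feat, dtype=torch.float32))).numpy()
text_net = train_to_match(HashNet(text_dim, n_bits), text_feat, img_codes)

Because each network is fit to fixed target codes rather than to pairwise similarities, the two modalities can be optimized independently, which is the property the abstract highlights for scaling to large datasets.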
Citation
Chen, Z. D., Yu, W. J., Li, C. X., Nie, L., & Xu, X. S. (2018). Dual deep neural networks cross-modal hashing. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 274–281). AAAI press. https://doi.org/10.1609/aaai.v32i1.11249