Cross-modal event retrieval: A dataset and a baseline using deep semantic learning

Abstract

In this paper, we propose to learn a Deep Semantic Space (DSS) for cross-modal event retrieval by exploiting deep learning models to jointly extract semantic features from images and textual articles. More specifically, a VGG network is used to transfer deep semantic knowledge from a large-scale image dataset to the target image dataset. Simultaneously, a fully-connected network is designed to model semantic representations from textual features (e.g., TF-IDF, LDA). The resulting deep semantic representations of images and text are then mapped into a shared high-level semantic space, in which the distance between data samples can be measured directly for cross-modal event retrieval. In addition, we collect a dataset called the Wiki-Flickr event dataset for cross-modal event retrieval, in which the data are only weakly aligned, unlike the image-text pairs in existing cross-modal retrieval datasets. Extensive experiments on both the Pascal Sentence dataset and our Wiki-Flickr event dataset show that DSS outperforms state-of-the-art approaches.
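
To make the two-branch design described in the abstract concrete, the following is a minimal PyTorch-style sketch: a pretrained VGG backbone projects images into a shared space, a fully-connected network does the same for textual feature vectors (e.g., TF-IDF), and retrieval ranks items by similarity in that space. The layer sizes, the embedding dimension, and the cosine-similarity ranking are illustrative assumptions, not the authors' reported configuration.

# Illustrative sketch of a two-branch deep semantic space model (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class ImageBranch(nn.Module):
    """VGG backbone (ImageNet-pretrained) with a projection head into the shared space."""
    def __init__(self, embed_dim=256):
        super().__init__()
        vgg = models.vgg16(pretrained=True)  # transfers knowledge from a large-scale image dataset
        self.features = vgg.features
        self.pool = vgg.avgpool
        self.fc = nn.Sequential(*list(vgg.classifier.children())[:-1])  # drop the 1000-way classifier
        self.project = nn.Linear(4096, embed_dim)

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        x = self.fc(x)
        return F.normalize(self.project(x), dim=-1)  # unit-norm embedding


class TextBranch(nn.Module):
    """Fully-connected network over textual features such as TF-IDF or LDA vectors."""
    def __init__(self, text_dim=5000, embed_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(text_dim, 1024), nn.ReLU(),
            nn.Linear(1024, embed_dim),
        )

    def forward(self, t):
        return F.normalize(self.net(t), dim=-1)


def retrieve(query_emb, gallery_embs, k=5):
    """Rank gallery items by cosine similarity to the query in the shared semantic space."""
    sims = gallery_embs @ query_emb  # embeddings are already unit-normalized
    return sims.topk(k).indices


# Usage: embed a dummy image query and a batch of dummy text articles, then rank the articles.
img_net, txt_net = ImageBranch(), TextBranch()
img_emb = img_net(torch.randn(1, 3, 224, 224)).squeeze(0)
txt_embs = txt_net(torch.randn(10, 5000))
print(retrieve(img_emb, txt_embs))

In practice the two branches would be trained jointly (for example, with a classification or ranking loss over event labels) so that semantically related images and articles land close together in the shared space; the sketch above only shows the inference-time retrieval step.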

Citation (APA)
Situ, R., Yang, Z., Lv, J., Li, Q., & Liu, W. (2018). Cross-modal event retrieval: A dataset and a baseline using deep semantic learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11165 LNCS, pp. 147–157). Springer Verlag. https://doi.org/10.1007/978-3-030-00767-6_14