An Efficient Approach for Geo-Multimedia Cross-Modal Retrieval


Abstract

Due to the rapid development of mobile Internet technologies such as online social networking and location-based services, massive amounts of multimedia data with geographical information are generated and uploaded to the Internet. In this paper, we propose a novel type of cross-modal multimedia retrieval, called geo-multimedia cross-modal retrieval, which aims to find a set of geo-multimedia objects according to geographical proximity and semantic concept similarity. Previous studies on cross-modal retrieval and spatial keyword search cannot address this problem effectively because they do not consider multimedia data with geo-tags (geo-multimedia). Firstly, we present the definition of the kNN geo-multimedia cross-modal query and introduce relevant concepts such as spatial distance and semantic similarity measurement. As the key notion of this work, the cross-modal semantic representation space is formulated for the first time. We then propose a novel framework for geo-multimedia cross-modal retrieval, which includes multi-modal feature extraction, cross-modal semantic space mapping, geo-multimedia spatial indexing, and cross-modal semantic similarity measurement. To bridge the semantic gap between different modalities, we also propose a method named cross-modal semantic matching (CoSMat for short), which contains two important components, CorrProj and LogsTran, and aims to build a common semantic representation space for cross-modal semantic similarity measurement. In addition, to implement semantic similarity measurement, we employ a deep learning based method to learn multi-modal features that contain more high-level semantic information. Moreover, a novel hybrid index, the GMR-Tree, is carefully designed, which combines signatures of semantic representations with an R-Tree. An efficient GMR-Tree based kNN search algorithm called kGMCMS is developed. Comprehensive experimental evaluations on real and synthetic datasets clearly demonstrate that our approach outperforms state-of-the-art methods.
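The abstract does not include pseudocode, so the sketch below is only a minimal Python illustration of the query type it describes: features from each modality are mapped into a common semantic space, and geo-multimedia objects are ranked by a weighted combination of spatial proximity and cross-modal semantic similarity. The linear projection (a stand-in for CorrProj), the logistic squashing (a stand-in for LogsTran), the cosine similarity, the weighting parameter `alpha`, and all function names are assumptions for illustration; the paper's actual CoSMat formulation, scoring function, and GMR-Tree pruning are not reproduced here. The linear scan in `knn_geo_cross_modal` is the naive baseline that the GMR-Tree index is designed to avoid.

```python
import heapq
import math
import numpy as np

def logistic(x):
    # Illustrative stand-in for LogsTran: squash projected features into
    # (0, 1) so both modalities live on a comparable scale.
    return 1.0 / (1.0 + np.exp(-x))

def to_semantic_space(features, proj):
    # Illustrative stand-in for CorrProj: a linear map into the common
    # semantic representation space, followed by the logistic transform.
    return logistic(features @ proj)

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def knn_geo_cross_modal(query_vec, query_loc, objects, k=5, alpha=0.5, d_max=1.0):
    """Naive linear-scan version of a kNN geo-multimedia cross-modal query:
    rank objects by a weighted sum of spatial proximity and cross-modal
    semantic similarity (higher is better)."""
    scored = []
    for obj_id, obj_vec, obj_loc in objects:
        # Normalize geographic distance into [0, 1] and flip it so that
        # nearer objects score higher.
        spatial = 1.0 - min(math.dist(query_loc, obj_loc) / d_max, 1.0)
        semantic = cosine_sim(query_vec, obj_vec)
        scored.append((alpha * spatial + (1.0 - alpha) * semantic, obj_id))
    return heapq.nlargest(k, scored)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical projections from text (16-d) and image (32-d) feature
    # spaces into a shared 8-d semantic space.
    proj_text = rng.normal(size=(16, 8))
    proj_image = rng.normal(size=(32, 8))
    query_vec = to_semantic_space(rng.normal(size=16), proj_text)
    objects = [
        (f"img{i}",
         to_semantic_space(rng.normal(size=32), proj_image),
         (rng.uniform(0, 1), rng.uniform(0, 1)))
        for i in range(100)
    ]
    print(knn_geo_cross_modal(query_vec, (0.5, 0.5), objects,
                              k=5, d_max=math.sqrt(2)))
```

A spatial index such as the paper's GMR-Tree would replace the exhaustive loop above with best-first traversal, pruning subtrees whose bounding rectangles and semantic signatures cannot beat the current k-th best score.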

Cite

APA

Zhu, L., Long, J., Zhang, C., Yu, W., Yuan, X., & Sun, L. (2019). An Efficient Approach for Geo-Multimedia Cross-Modal Retrieval. IEEE Access, 7, 180571–180589. https://doi.org/10.1109/ACCESS.2019.2940055
