When matrix factorisation is used to extract latent features for cross-media retrieval, semantic information may be lost during factorisation. In addition, many existing approaches map the different modalities directly into an isomorphic semantic space in order to measure cross-modal similarity, which also loses crucial information. To address these problems, a semantic convex matrix factorisation subspace learning approach is proposed for cross-media retrieval between image and text. The proposed method extracts an intermediate-level feature representation for the high-dimensional image modality to reduce the loss of information and, at the same time, learns a feature representation enriched with semantic information for the lower-dimensional text modality to strengthen its discriminative capability. The intermediate-level feature representation of the image is then mapped into a latent semantic space by a projection matrix, so that the similarity between modalities can be estimated from latent feature representations of the same dimension. Experimental results on three benchmark datasets demonstrate the superiority of the proposed approach over several state-of-the-art approaches.
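The final two steps of the abstract (projecting image features into the latent space, then comparing modalities in that space) can be illustrated with a minimal sketch. It does not reproduce the semantic convex matrix factorisation objective of the paper, which the abstract does not specify. As assumptions for illustration only, the intermediate-level image features and the text latent features are taken as given (synthetic here), the projection matrix is fitted by ridge regression as a stand-in, and cosine similarity is used as the similarity measure. All function and variable names are hypothetical.

```python
import numpy as np

def fit_projection(img_feats, txt_latent, reg=1e-2):
    """Fit a projection P by ridge regression (an assumed stand-in):
    minimise ||img_feats @ P - txt_latent||^2 + reg * ||P||^2."""
    d = img_feats.shape[1]
    gram = img_feats.T @ img_feats + reg * np.eye(d)
    return np.linalg.solve(gram, img_feats.T @ txt_latent)

def cosine_similarity(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_n @ b_n.T

def rank_texts(img_query, projection, txt_latent):
    """Project image queries into the latent space and rank all texts
    by decreasing cosine similarity (one row of indices per query)."""
    sims = cosine_similarity(img_query @ projection, txt_latent)
    return np.argsort(-sims, axis=1)

# Synthetic paired data (assumption): 50 image/text pairs, 40-dimensional
# intermediate image features and a 10-dimensional latent semantic space.
rng = np.random.default_rng(0)
n, d_img, d_lat = 50, 40, 10
txt_latent = rng.random((n, d_lat))
true_map = rng.standard_normal((d_lat, d_img))
img_feats = txt_latent @ true_map + 0.01 * rng.standard_normal((n, d_img))

P = fit_projection(img_feats, txt_latent)
ranking = rank_texts(img_feats[:3], P, txt_latent)  # image-to-text retrieval
```

Each row of `ranking` lists text indices from most to least similar for one image query. Text-to-image retrieval would follow the same pattern, with the projected image features as the gallery and text latent features as queries.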
Fang, Y., Ren, Y., & Zhang, H. (2019). Semantic convex matrix factorisation for cross-media retrieval. IET Image Processing, 13(1), 196–205. https://doi.org/10.1049/iet-ipr.2018.5853