Micro-videos are becoming one of the most representative products of the new media age. Although their length is limited to suit the fast pace of modern life and to ease rapid distribution, micro-videos are usually recorded in specific scenarios and tend to convey relatively complete events. To obtain the event types of micro-videos more accurately and thereby facilitate potential applications, we propose a low-rank regularized multimodal representation method for micro-video event detection. To compensate for the limited descriptive power of each individual modality, a latent common representation of micro-videos is obtained by exploiting the complementarity among modalities. Accuracy improves considerably when this is combined with a low-rank constraint that seeks the lowest-rank intrinsic representation and a flexible label-relaxation strategy for the mappings between representations and their corresponding labels. A newly constructed micro-video dataset is used to verify the advantages of the proposed model. The experimental results demonstrate that the proposed method outperforms state-of-the-art methods.
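As a rough illustration of what a low-rank constraint on a representation matrix does, the sketch below applies singular value thresholding, the standard proximal step for a nuclear-norm penalty. The abstract does not state which relaxation or solver the authors use, so the nuclear-norm relaxation, the function name `singular_value_threshold`, the threshold `tau`, and the synthetic data are all assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def singular_value_threshold(matrix, tau):
    """Shrink every singular value of `matrix` by `tau`, clipping at zero.

    This is the proximal operator of tau * (nuclear norm), a common convex
    surrogate for a rank penalty. Assumed here for illustration only.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunk singular values, then recombine.
    return (u * s_shrunk) @ vt

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples with a 20-dimensional representation that is
# truly rank 2, corrupted by small dense noise.
low_rank = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20))
noisy = low_rank + 0.01 * rng.standard_normal((50, 20))

denoised = singular_value_threshold(noisy, tau=1.0)

rank_before = np.linalg.matrix_rank(noisy)
rank_after = np.linalg.matrix_rank(denoised)
print("rank before:", rank_before, "rank after:", rank_after)
```

In this toy setting the noise contributes only small singular values, so thresholding removes them and leaves the dominant low-rank structure. A full method would embed such a step in an iterative solver alongside the multimodal and label-relaxation terms.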
CITATION
Zhang, J., Wu, Y., Liu, J., Jing, P., & Su, Y. (2020). Low-Rank Regularized Multimodal Representation for Micro-Video Event Detection. IEEE Access, 8, 87266–87274. https://doi.org/10.1109/ACCESS.2020.2992436