Joint Learning of LSTMs-CNN and Prototype for Micro-Video Venue Classification

Abstract

Generally, the venue category of a micro-video is an important cue in social network applications, such as location-oriented applications and personalized services. In existing micro-video venue classification methods, discriminative power degrades because of unsuitable convolutional filter sizes and padding, and robustness suffers because of the softmax layer. To alleviate these problems, we propose a novel learning framework that jointly learns an LSTMs-CNN and a Prototype for micro-video venue classification. Specifically, the LSTMs-CNN, with SAME convolutional padding and small convolutional filters, extracts spatio-temporal information, while the Prototype is learned simultaneously to improve robustness over the softmax classification function. The whole network is trained with a Euclidean distance loss. Extensive experimental results on a real-world dataset show that our model significantly outperforms the state-of-the-art baselines in terms of both Micro-F and Macro-F scores.
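
To make the described pipeline concrete, below is a minimal, hypothetical sketch of the joint LSTMs-CNN and Prototype idea in PyTorch. The layer sizes, class count, and module names are illustrative assumptions, not the authors' exact architecture or hyperparameters; it only shows the ingredients named in the abstract: small 3x3 filters with SAME-style padding, an LSTM over per-frame CNN features, learnable per-class prototypes in place of a softmax layer, and a Euclidean distance training loss.

```python
# Hypothetical sketch only: sizes, names, and the 3x3/SAME choices are
# illustrative assumptions based on the abstract, not the authors' code.
import torch
import torch.nn as nn


class LSTMsCNNPrototype(nn.Module):
    def __init__(self, num_classes, embed_dim=128, hidden_dim=256):
        super().__init__()
        # Small 3x3 filters with padding=1 (SAME padding at stride 1),
        # applied to each video frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # -> (B*T, 64, 1, 1)
        )
        # LSTM aggregates the per-frame CNN features over time.
        self.lstm = nn.LSTM(64, hidden_dim, batch_first=True)
        self.embed = nn.Linear(hidden_dim, embed_dim)
        # One learnable prototype per venue category, trained jointly
        # with the network instead of a softmax classification layer.
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, frames):                    # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).flatten(1)   # (B*T, 64)
        _, (h_n, _) = self.lstm(feats.view(b, t, -1))
        z = self.embed(h_n[-1])                   # video embedding (B, D)
        # Squared Euclidean distance to every prototype: (B, num_classes)
        dists = torch.cdist(z, self.prototypes).pow(2)
        return z, dists

    def loss(self, dists, labels):
        # Euclidean-distance loss: pull each embedding toward the
        # prototype of its ground-truth venue category.
        return dists.gather(1, labels.unsqueeze(1)).mean()


# Usage: classification reduces to the nearest prototype in embedding space.
model = LSTMsCNNPrototype(num_classes=10)         # class count is illustrative
frames = torch.randn(2, 8, 3, 64, 64)             # 2 clips, 8 frames each
labels = torch.tensor([3, 7])
_, dists = model(frames)
loss = model.loss(dists, labels)
pred = dists.argmin(dim=1)                        # predicted venue categories
```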

Citation (APA)

Liu, W., Huang, X., Cao, G., Song, G., & Yang, L. (2018). Joint learning of LSTMs-CNN and prototype for micro-video venue classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11165 LNCS, pp. 705–715). Springer Verlag. https://doi.org/10.1007/978-3-030-00767-6_65
