Multimodal PLSA for movie genre classification

Abstract

This paper categorizes movies into genres using their previews. We combine audio, visual, and text features to classify a collection of movie previews into four genres: action, biography, comedy, and horror. For each collected preview, audio and visual features are extracted, and text features are drawn from social tags on social websites. Probabilistic latent semantic analysis (PLSA) is used to integrate the features from these three sources of information. Because standard PLSA handles only one type of information, we extend it to double-model and triple-model PLSAs that combine two or three types of information. We compare these PLSA variants with unimodal PLSAs, which use audio, visual, or text features alone. The experimental results show not only that one of the triple-model PLSAs achieves the highest accuracy, but also that social tags (text features) play an important role in classifying movie genres.
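The abstract does not spell out how the double-model and triple-model PLSAs are built, so the following is only a minimal sketch of one plausible construction, not the authors' formulation. It assumes that every preview d has a single topic mixture P(z|d) shared by all modalities, and that each modality m keeps its own feature distribution P_m(w|z). With one modality it reduces to standard PLSA fitted by EM. The function name `fit_multimodal_plsa`, its parameters, and the toy count matrices are hypothetical.

```python
import random


def normalize(values):
    """Scale non-negative values so they sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def fit_multimodal_plsa(counts_by_modality, n_topics, n_iter=50, seed=0):
    """Fit a shared-topic PLSA by EM (an assumed design, see the lead-in).

    counts_by_modality: list of D x V_m count matrices (lists of lists),
    one per modality, all with the same number of rows (documents).
    Returns (p_z_d, p_w_z), where p_z_d[d][z] approximates P(z|d) and
    p_w_z[m][z][w] approximates P_m(w|z).
    """
    rng = random.Random(seed)
    n_docs = len(counts_by_modality[0])
    # One topic mixture per document, shared across modalities (assumption).
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    # One feature distribution per modality and topic.
    p_w_z = [[normalize([rng.random() for _ in range(len(counts[0]))])
              for _ in range(n_topics)]
             for counts in counts_by_modality]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[[0.0] * len(counts[0]) for _ in range(n_topics)]
                   for counts in counts_by_modality]
        for m, counts in enumerate(counts_by_modality):
            for d, row in enumerate(counts):
                for w, n in enumerate(row):
                    if n == 0:
                        continue
                    # E-step: posterior over topics for this (d, w) pair.
                    post = normalize([p_z_d[d][z] * p_w_z[m][z][w]
                                      for z in range(n_topics)])
                    # Accumulate expected counts for the M-step.
                    for z in range(n_topics):
                        new_z_d[d][z] += n * post[z]
                        new_w_z[m][z][w] += n * post[z]
        # M-step: renormalize the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [[normalize(topic) for topic in modality]
                 for modality in new_w_z]
    return p_z_d, p_w_z


# Toy data (made up): 4 previews, an "audio" vocabulary of 3 codewords
# and a "tag" vocabulary of 2 words.
audio = [[5, 1, 0], [4, 2, 0], [0, 1, 6], [0, 2, 5]]
tags = [[3, 0], [2, 0], [0, 4], [0, 3]]
p_z_d, p_w_z = fit_multimodal_plsa([audio, tags], n_topics=2)
```

In this sketch, each row of `p_z_d` is a low-dimensional description of one preview. Feeding those rows to a genre classifier would be one possible next step, though the abstract does not say how the authors map topics to genres.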

Cite

Hong, H. Z., & Hwang, J. I. G. (2015). Multimodal PLSA for movie genre classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9132, pp. 159–167). Springer Verlag. https://doi.org/10.1007/978-3-319-20248-8_14
