Multimodal PLSA for movie genre classification

Hao Zhi Hong; Jen Ing G. Hwang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9132 159-167

DOI: 10.1007/978-3-319-20248-8_14

Abstract

This paper aims to categorize movies into genres using their previews. We combine audio, visual, and text features to classify a collection of movie previews into four genres: action, biography, comedy, and horror. For each collected preview, audio and visual features are extracted, and text features are drawn from social tags on social websites. Probabilistic latent semantic analysis (PLSA) is used to incorporate the features from these three sources of information. Because standard PLSA processes only one type of information, we extend it to double-model and triple-model PLSAs that combine two or three types of information. We compare these multimodal variants with unimodal PLSAs, which use audio, visual, or text features alone. The experimental results show not only that one of the triple-model PLSAs achieves the highest accuracy, but also that social tags (text features) play an important role in classifying movie genres.
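The abstract does not spell out how the double-model and triple-model PLSAs are built, so the following is only a minimal sketch of one common way to let PLSA consume several feature types at once. It assumes that every modality has its own feature distribution P_m(w|z) while all modalities share a single topic mixture P(z|d) per preview, and that all modalities are weighted equally. The function name `plsa_shared_topics`, the toy count matrices, and these modelling choices are illustrative assumptions, not the authors' formulation. With a single matrix the routine reduces to standard (unimodal) PLSA fitted by EM.

```python
import random


def _normalize(values):
    """Scale positive values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa_shared_topics(modalities, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on several count matrices that share P(z|d).

    ASSUMPTION: shared P(z|d), one P_m(w|z) per modality, equal weights.
    modalities: list of matrices (lists of rows); matrix m holds the
    count n_m(d, w) of feature w in document (preview) d.
    Returns (p_z_d, p_w_z) with p_w_z[m][z][w] = P_m(w|z).
    """
    rng = random.Random(seed)
    n_docs = len(modalities[0])
    eps = 1e-12  # smoothing that keeps every normalization well defined

    # Random initialization of the shared topic mixtures and the
    # per-modality feature distributions.
    p_z_d = [_normalize([rng.random() + eps for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [[_normalize([rng.random() + eps for _ in X[0]])
              for _ in range(n_topics)]
             for X in modalities]

    for _ in range(n_iter):
        zd_acc = [[eps] * n_topics for _ in range(n_docs)]
        wz_acc = [[[eps] * len(X[0]) for _ in range(n_topics)]
                  for X in modalities]
        for m, X in enumerate(modalities):
            for d, row in enumerate(X):
                for w, count in enumerate(row):
                    if count == 0:
                        continue
                    # E-step: P(z|d,w) is proportional to P(z|d) * P_m(w|z).
                    joint = [p_z_d[d][z] * p_w_z[m][z][w]
                             for z in range(n_topics)]
                    total = sum(joint) + eps
                    for z in range(n_topics):
                        r = count * joint[z] / total
                        zd_acc[d][z] += r      # pooled over modalities
                        wz_acc[m][z][w] += r   # kept per modality
        # M-step: renormalize the accumulated expected counts.
        p_z_d = [_normalize(row) for row in zd_acc]
        p_w_z = [[_normalize(row) for row in topic_rows]
                 for topic_rows in wz_acc]
    return p_z_d, p_w_z


# Toy data (made up): 4 previews, 3 audio "words" and 2 tag "words".
audio = [[5, 0, 1], [4, 1, 0], [0, 6, 2], [1, 5, 3]]
tags = [[3, 0], [2, 0], [0, 4], [0, 3]]
p_z_d, p_w_z = plsa_shared_topics([audio, tags], n_topics=2)
```

Each row of `p_z_d` is a low-dimensional topic mixture for one preview. How such mixtures are turned into genre labels is not described in the abstract and is left open here.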

Citation (APA)

Hong, H. Z., & Hwang, J. I. G. (2015). Multimodal PLSA for movie genre classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9132, pp. 159–167). Springer Verlag. https://doi.org/10.1007/978-3-319-20248-8_14
