Exploring Textual Features for Multi-label Classification of Portuguese Film Synopses

Abstract

The multi-label classification of film genres using features extracted from their synopses has recently gained some attention from the scientific community; however, the number of studies is still limited. Such studies are even scarcer for languages other than English. In this work we present the P-TMDb dataset, which contains 13,394 Portuguese film synopses, and explore film genre classification by experimenting with nine different groups of textual features and four multi-label algorithms. As our dataset is unbalanced, we also conducted experiments with an oversampled version of the dataset. The best result for the original dataset was achieved by a TF-IDF based classifier, with an average F1 score of 0.478, while the best result for the oversampled dataset was achieved by a combination of several feature groups, with an average F1 score of 0.611.
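
As a rough illustration of how an average F1 score can be computed in a multi-label setting, the sketch below calculates F1 separately for each genre and then takes the unweighted mean over genres (macro-averaging). The abstract does not state which averaging scheme the authors used, so macro-averaging is an assumption here, and the function names and toy genre sets are hypothetical rather than taken from the paper or the P-TMDb dataset.

```python
from typing import List, Set


def f1_for_label(y_true: List[Set[str]], y_pred: List[Set[str]], label: str) -> float:
    """F1 for a single genre, treating it as a binary decision per film."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the genre never occurs
    # in either the true or the predicted sets.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of the per-genre F1 scores (macro-averaging, an assumption)."""
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy data (hypothetical): each film has a set of true and predicted genres.
y_true = [{"Drama"}, {"Comedy", "Romance"}, {"Drama", "Romance"}]
y_pred = [{"Drama"}, {"Comedy"}, {"Drama", "Comedy"}]
labels = ["Comedy", "Drama", "Romance"]

score = macro_f1(y_true, y_pred, labels)
print(f"macro F1 = {score:.3f}")
```

Under macro-averaging, a rare genre that is never predicted correctly pulls the average down as much as a frequent one would, which is one reason class imbalance matters for this kind of score.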

Cite

Portolese, G., Domingues, M. A., & Feltrim, V. D. (2019). Exploring Textual Features for Multi-label Classification of Portuguese Film Synopses. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11805 LNAI, pp. 669–681). Springer Verlag. https://doi.org/10.1007/978-3-030-30244-3_55
