Reading book by the cover—Book genre detection using short descriptions

Antoni Sobkowicz; Marek Kozłowski; Przemysław Buczkowski

Conference Proceedings

Advances in Intelligent Systems and Computing (2018) 659 439-448

DOI: 10.1007/978-3-319-67792-7_43

Abstract

This paper addresses short text classification, using free-text descriptions of books gathered by crawling the GoodReads portal. The descriptions are short, often incomplete, and highly biased towards the genre of their respective books, so establishing a notion of proximity between such texts is a challenging task. Each book was assigned multiple categories out of a total of 506, which makes the problem of genre distribution statistically significant. In addition, the number of descriptions varies from genre to genre, making the data imbalanced. To choose the best text classification method for this task, we examine different methods, including baseline naive Bayes models and semantic enrichment methods that use neural-based distributional models. The algorithms were evaluated in terms of classification quality on a unique data set of almost two hundred thousand book descriptions.
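To make the baseline concrete, the following is a minimal sketch of multi-label genre detection with naive Bayes, assuming a one-vs-rest setup (one binary multinomial model per genre) with Laplace smoothing. The abstract does not specify the authors' exact configuration, so the class name `OneVsRestNaiveBayes`, the whitespace tokenizer, and the toy descriptions are all illustrative assumptions rather than the paper's implementation or data.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


class OneVsRestNaiveBayes:
    """Illustrative multi-label classifier: one binary multinomial NB per genre."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, texts, label_sets):
        docs = [tokenize(t) for t in texts]
        self.n_docs = len(docs)
        self.vocab = {w for d in docs for w in d}
        self.models = {}
        for genre in sorted({g for labels in label_sets for g in labels}):
            # Word counts and document counts for "has genre" / "lacks genre".
            word_counts = {True: Counter(), False: Counter()}
            doc_counts = {True: 0, False: 0}
            for doc, labels in zip(docs, label_sets):
                side = genre in labels
                word_counts[side].update(doc)
                doc_counts[side] += 1
            self.models[genre] = (word_counts, doc_counts)
        return self

    def _log_score(self, tokens, word_counts, doc_counts, side):
        # Smoothed log prior plus smoothed log likelihood of each known token.
        score = math.log((doc_counts[side] + self.alpha)
                         / (self.n_docs + 2 * self.alpha))
        total = sum(word_counts[side].values()) + self.alpha * len(self.vocab)
        for w in tokens:
            if w in self.vocab:  # words unseen in training are ignored
                score += math.log((word_counts[side][w] + self.alpha) / total)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        predicted = set()
        for genre, (word_counts, doc_counts) in self.models.items():
            pos = self._log_score(tokens, word_counts, doc_counts, True)
            neg = self._log_score(tokens, word_counts, doc_counts, False)
            if pos > neg:
                predicted.add(genre)
        return predicted


# Toy data, invented for illustration only (not from the GoodReads crawl).
texts = [
    "a wizard casts a spell on the dragon",
    "the dragon guards a magic sword",
    "a detective investigates the murder",
    "the detective finds a clue to the murder",
]
label_sets = [{"fantasy"}, {"fantasy", "adventure"}, {"mystery"}, {"mystery"}]

model = OneVsRestNaiveBayes().fit(texts, label_sets)
fantasy_guess = model.predict("dragon and wizard")
mystery_guess = model.predict("detective murder")
```

Because each genre gets its own binary decision, a description can receive several genres or none. The per-genre prior is one simple place where class imbalance enters the model: a rare genre such as "adventure" in the toy data needs stronger word evidence before it is predicted.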

Sobkowicz, A., Kozłowski, M., & Buczkowski, P. (2018). Reading book by the cover—Book genre detection using short descriptions. In Advances in Intelligent Systems and Computing (Vol. 659, pp. 439–448). Springer Verlag. https://doi.org/10.1007/978-3-319-67792-7_43