Identification of tweets that mention books: An experimental comparison of machine learning methods

Shuntaro Yada; Kyo Kageura

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9469 278-288

DOI: 10.1007/978-3-319-27974-9_30

Abstract

In this paper, we address the task of identifying tweets that mention books (TMB) among tweets that contain the same strings as full book titles. Although this task can be treated as a kind of Named Entity Recognition, it is harder because book titles often consist of ordinary expressions (such as “The Girl on the Train”). Furthermore, when tweets are gathered through a dictionary-based search, those that contain the same strings as full book titles are often spam. However, given a complete list of book titles (for example, a library union catalogue or commercial bibliographic data from a bookstore), the task can be solved by text classification. We therefore propose a two-step pipeline consisting of spam filtering and TMB classification, based on supervised learning with a small amount of labelled data. We constructed the classifiers by comparing combinations of four proven supervised learning methods with different features and selecting the best-performing ones. Given the difficulty of the task, our pipeline performed well (an F-score of about 0.7).
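The two-step pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the four learning methods or the features compared, so a small bag-of-words Naive Bayes classifier stands in for both steps, and the label names (`"ham"`, `"tmb"`) and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed, deliberately simple feature extractor)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing; a stand-in classifier."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def contains_title(tweet, titles):
    """Dictionary-based search: does the tweet contain any full title string?"""
    lowered = tweet.lower()
    return any(title.lower() in lowered for title in titles)


def identify_tmb(tweets, titles, spam_clf, tmb_clf):
    """Step 0: title match; step 1: spam filtering; step 2: TMB classification."""
    candidates = [t for t in tweets if contains_title(t, titles)]
    non_spam = [t for t in candidates if spam_clf.predict(t) == "ham"]
    return [t for t in non_spam if tmb_clf.predict(t) == "tmb"]


def f_score(predicted, gold):
    """F1: harmonic mean of precision and recall over sets of tweet IDs."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `spam_clf` would be trained on tweets labelled spam or not, and `tmb_clf` on non-spam tweets labelled as mentioning a book or not. Chaining the two keeps each classifier's task narrow, which seems consistent with the abstract's emphasis on working from a small amount of labelled data.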

Citation (APA)

Yada, S., & Kageura, K. (2015). Identification of tweets that mention books: An experimental comparison of machine learning methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9469, pp. 278–288). Springer Verlag. https://doi.org/10.1007/978-3-319-27974-9_30
