Age identification of twitter users: Classification methods and sociolinguistic analysis

Vasiliki Simaki; Iosif Mporas; Vasileios Megalooikonomou

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 9624 LNCS 385-395

DOI: 10.1007/978-3-319-75487-1_30

2 Citations

16 Readers

Abstract

In this article, we address the problem of identifying the age of Twitter users from the text they post online. We used a set of text mining, sociolinguistic-based, and content-related text features, and we evaluated a number of well-known and widely used machine learning classification algorithms to examine their suitability for this task. The experimental results showed that the Random Forest algorithm offered superior performance, achieving an accuracy of 61%. We ranked the classification features by their informativeness using the ReliefF algorithm, and we analyzed the results in terms of the sociolinguistic principles of age-related linguistic variation.

APA

Simaki, V., Mporas, I., & Megalooikonomou, V. (2018). Age identification of twitter users: Classification methods and sociolinguistic analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9624 LNCS, pp. 385–395). Springer Verlag. https://doi.org/10.1007/978-3-319-75487-1_30
