Sockpuppets are online identities controlled by a user or group of users to manipulate the dissemination of information in digital environments. This manipulation can distort computational assessments of public opinion in social media. Using Russian-language Twitter data from the 2014 Ukrainian crisis, we present a proof-of-concept model that employs character n-gram methods to detect sockpuppets. Previous research has demonstrated that n-gram authorship attribution methods can capture lexical preferences, including grammatical and orthographic ones, while being less computationally intensive than grammatical or compression language models. These methods can also be applied to data in any language, irrespective of orthography. In this study, a Naïve Bayes classifier was constructed from normalized frequencies of parsed character bigrams to contrast authors' bigram use. As predicted, the resulting model classified suspected sockpuppet accounts correctly less often than other accounts, with lower precision, recall, and F-measure.
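The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `BigramNaiveBayes`, the helper names, the Laplace smoothing constant, and the toy texts are all assumptions made for illustration, and the sketch smooths raw bigram counts rather than reproducing the paper's exact frequency normalization. It shows a multinomial Naïve Bayes classifier over character bigrams, plus per-author precision, recall, and F-measure, the scores the abstract reports as lower for suspected sockpuppets.

```python
import math
from collections import Counter, defaultdict


def char_bigrams(text):
    """Return the overlapping character bigrams of a lowercased string."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


class BigramNaiveBayes:
    """Multinomial Naive Bayes over character bigrams (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                  # Laplace smoothing (assumed value)
        self.counts = defaultdict(Counter)  # author -> bigram counts
        self.docs = Counter()               # author -> number of training texts
        self.vocab = set()                  # all bigrams seen in training

    def fit(self, texts, authors):
        for text, author in zip(texts, authors):
            grams = char_bigrams(text)
            self.counts[author].update(grams)
            self.docs[author] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Log prior plus summed smoothed log likelihoods, per author."""
        grams = char_bigrams(text)
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        scores = {}
        for author, counts in self.counts.items():
            total = sum(counts.values())
            score = math.log(self.docs[author] / total_docs)
            for gram in grams:
                score += math.log(
                    (counts[gram] + self.alpha)
                    / (total + self.alpha * vocab_size)
                )
            scores[author] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def per_author_metrics(gold, predicted):
    """Return {author: (precision, recall, f_measure)} for each gold author."""
    metrics = {}
    for author in set(gold):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == author and p == author)
        fp = sum(1 for g, p in pairs if g != author and p == author)
        fn = sum(1 for g, p in pairs if g == author and p != author)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[author] = (precision, recall, f_measure)
    return metrics


# Invented toy data; real use would need many tweets per account.
train_texts = ["привет как дела", "привет всем",
               "hello how are you", "hello everyone"]
train_authors = ["user_a", "user_a", "user_b", "user_b"]
model = BigramNaiveBayes().fit(train_texts, train_authors)
```

Because the features are raw character pairs, the same code handles Cyrillic and Latin text without tokenization or language-specific resources. Under the abstract's hypothesis, accounts whose per-author scores from `per_author_metrics` fall well below those of other accounts would be candidates for closer inspection, not confirmed sockpuppets.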
Crabb, E. S., Mishler, A., Paletz, S., Hefright, B., & Golonka, E. (2015). Reading between the lines: A prototype model for detecting twitter sockpuppet accounts using language-agnostic processes. In Communications in Computer and Information Science (Vol. 528, pp. 656–661). Springer Verlag. https://doi.org/10.1007/978-3-319-21380-4_111