Machine learning for the detection of spam in twitter networks

Abstract

The rapidly growing online social networking sites have been infiltrated by a large amount of spam. In this paper, I focus on one of the most popular sites, Twitter, as an example to study spam behavior. To facilitate spam detection, a directed social graph model is proposed to explore the "follower" and "friend" relationships among users. Based on Twitter's spam policy, novel content-based and graph-based features are also proposed. A Web crawler is developed that relies on Twitter's API methods. A prototype spam detection system is proposed to identify suspicious users on Twitter. I analyze the data set and evaluate the performance of the detection system. Classic evaluation metrics are used to compare the performance of various traditional classification methods. Experimental results show that the Bayesian classifier has the best overall performance in terms of F-measure. The trained Bayesian classifier is also applied to the entire data set to distinguish suspicious behaviors from normal ones. The results show that the spam detection system can achieve 89% precision. © 2012 Springer-Verlag.
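
The abstract does not give the feature definitions or the evaluation code, so the following is only a minimal Python sketch of two ideas it mentions: counting "followers" and "friends" on a directed graph, and computing precision, recall and F-measure for a binary spam label. It assumes that an edge (a, b) means "a follows b" and that a user's friends are the accounts that user follows. The function names, the edge-list input and the label encoding (1 for spam) are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def degree_features(follow_edges):
    """Count followers and friends per user on a directed follow graph.

    Assumption: each edge (a, b) means "a follows b", so b gains a follower
    and a gains a friend. This is an illustration, not the paper's model.
    """
    followers = defaultdict(set)
    friends = defaultdict(set)
    for a, b in follow_edges:
        friends[a].add(b)
        followers[b].add(a)
    users = set(followers) | set(friends)
    return {
        u: {"followers": len(followers[u]), "friends": len(friends[u])}
        for u in users
    }


def precision_recall_f1(y_true, y_pred, positive=1):
    """Standard precision, recall and F-measure (F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In a setup like the one described, per-user counts of this kind could be combined with content-based features and passed to a classifier, whose predictions are then scored with the second function.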

Cite

Wang, A. H. (2012). Machine learning for the detection of spam in twitter networks. In Communications in Computer and Information Science (Vol. 222 CCIS, pp. 319–333). https://doi.org/10.1007/978-3-642-25206-8_21
