Machine learning for the detection of spam in Twitter networks

Alex Hai Wang

Conference Proceedings

Communications in Computer and Information Science (2012) 222 CCIS 319-333

DOI: 10.1007/978-3-642-25206-8_21

Abstract

Online social networking sites are growing rapidly and have been infiltrated by a large amount of spam. In this paper, I use Twitter, one of the most popular of these sites, as an example to study spam behavior. To facilitate spam detection, a directed social graph model is proposed to explore the "follower" and "friend" relationships among users. Based on Twitter's spam policy, novel content-based and graph-based features are also proposed. A Web crawler is developed that relies on Twitter's API methods. A prototype spam detection system is proposed to identify suspicious users on Twitter. I analyze the data set and evaluate the performance of the detection system. Classic evaluation metrics are used to compare the performance of various traditional classification methods. Experimental results show that the Bayesian classifier has the best overall performance in terms of F-measure. The trained Bayesian classifier is also applied to the entire data set to distinguish suspicious behavior from normal behavior. The results show that the spam detection system can achieve 89% precision. © 2012 Springer-Verlag.
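The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it names: a directed graph over "follower" and "friend" relationships, and the F-measure used to compare classifiers. The names `FollowGraph`, `add_follow`, and `f_measure` are hypothetical. The edge direction (an edge from a to b means a follows b) and the use of the balanced F-measure (the harmonic mean of precision and recall) are assumptions, not details confirmed by the paper.

```python
from collections import defaultdict


class FollowGraph:
    """Directed graph in which an edge (a, b) means account a follows account b.

    Hypothetical illustration; not the paper's actual data structure.
    """

    def __init__(self):
        # a -> accounts that a follows (a's "friends" in Twitter's terminology)
        self.following = defaultdict(set)
        # b -> accounts that follow b (b's "followers")
        self.followed_by = defaultdict(set)

    def add_follow(self, a, b):
        """Record that account a follows account b."""
        self.following[a].add(b)
        self.followed_by[b].add(a)

    def friends(self, user):
        """Accounts that `user` follows (outgoing edges)."""
        return set(self.following.get(user, ()))

    def followers(self, user):
        """Accounts that follow `user` (incoming edges)."""
        return set(self.followed_by.get(user, ()))


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    g = FollowGraph()
    g.add_follow("alice", "bob")
    g.add_follow("carol", "bob")
    g.add_follow("bob", "alice")
    print(sorted(g.followers("bob")))  # accounts following bob
    print(sorted(g.friends("bob")))    # accounts bob follows
    print(f_measure(0.5, 0.5))
```

Keeping both edge directions in separate maps makes follower and friend lookups equally cheap, which matters if graph-based features are computed per user.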

Wang, A. H. (2012). Machine learning for the detection of spam in Twitter networks. In Communications in Computer and Information Science (Vol. 222 CCIS, pp. 319–333). https://doi.org/10.1007/978-3-642-25206-8_21
