Rapidly growing online social networking sites have been infiltrated by large amounts of spam. In this paper, I use Twitter, one of the most popular such sites, as an example to study spam behavior. To facilitate spam detection, a directed social graph model is proposed to explore the "follower" and "friend" relationships among users. Based on Twitter's spam policy, novel content-based and graph-based features are also proposed. A Web crawler is developed using Twitter's API methods. A prototype spam detection system is proposed to identify suspicious users on Twitter. I analyze the data set and evaluate the performance of the detection system. Classic evaluation metrics are used to compare the performance of several traditional classification methods. Experimental results show that the Bayesian classifier has the best overall performance in terms of F-measure. The trained Bayesian classifier is also applied to the entire data set to distinguish suspicious behavior from normal behavior. The results show that the spam detection system can achieve 89% precision. © 2012 Springer-Verlag.
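The directed graph model and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the usual Twitter convention that an edge u -> v means "u follows v", so that v is a friend of u and u is a follower of v. The names `FollowGraph`, `add_follow`, `friend_count`, `follower_count`, and `precision_recall_f` are hypothetical, and the F-measure shown is the standard harmonic mean of precision and recall.

```python
from collections import defaultdict


class FollowGraph:
    """Directed social graph: an edge u -> v means user u follows user v.

    Assumption: "friends" of u are the accounts u follows, and "followers"
    of u are the accounts that follow u.
    """

    def __init__(self):
        self._friends = defaultdict(set)    # u -> accounts that u follows
        self._followers = defaultdict(set)  # v -> accounts that follow v

    def add_follow(self, u, v):
        """Record the directed edge u -> v."""
        self._friends[u].add(v)
        self._followers[v].add(u)

    def friend_count(self, u):
        """Out-degree of u: how many accounts u follows."""
        return len(self._friends.get(u, ()))

    def follower_count(self, u):
        """In-degree of u: how many accounts follow u."""
        return len(self._followers.get(u, ()))


def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) from confusion-matrix counts.

    tp: spam accounts flagged as spam
    fp: legitimate accounts flagged as spam
    fn: spam accounts that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Toy data for illustration only; these are not figures from the paper.
    g = FollowGraph()
    g.add_follow("alice", "bob")
    g.add_follow("alice", "carol")
    g.add_follow("bob", "alice")
    print(g.friend_count("alice"), g.follower_count("alice"))
    print(precision_recall_f(tp=8, fp=2, fn=2))
```

In-degree and out-degree counts of this kind are one plausible basis for graph-based features; which features the paper actually derives from the graph is not stated in the abstract.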
CITATION STYLE
Wang, A. H. (2012). Machine learning for the detection of spam in Twitter networks. In Communications in Computer and Information Science (Vol. 222 CCIS, pp. 319–333). https://doi.org/10.1007/978-3-642-25206-8_21