Twitter spam is a critical problem, and current solutions rely mainly on machine learning based detection. However, recent studies found that spam features change continuously over time (the ‘Spam Drift’ problem), which may significantly affect detection performance. In this paper, we carried out a real-data driven study to explore the ‘Spam Drift’ problem and its impact on machine learning based detection. Our study found that only a small group of spam features changes continuously. The results also suggested a counter-intuitive conclusion: the ‘Spam Drift’ problem does not have a serious impact on spam detection Precision (SP) or non-spam detection Recall (NR), the two metrics that industry prioritises in practice.
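The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of precision and recall, with spam as the positive class for SP and non-spam as the positive class for NR; the abstract does not spell out the formulas, so the function names and the label encoding (1 for spam, 0 for non-spam) are illustrative assumptions rather than the authors' code.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], spam_label: int = 1
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating `spam_label` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == spam_label and truth == spam_label:
            tp += 1  # spam correctly flagged
        elif pred == spam_label:
            fp += 1  # non-spam wrongly flagged as spam
        elif truth == spam_label:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # non-spam correctly passed
    return tp, fp, tn, fn


def spam_precision(tp: int, fp: int) -> float:
    """SP (assumed definition): share of tweets flagged as spam that are spam."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def nonspam_recall(tn: int, fp: int) -> float:
    """NR (assumed definition): share of non-spam tweets that are not flagged."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical labels for six tweets: 1 = spam, 0 = non-spam.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sp = spam_precision(tp, fp)
nr = nonspam_recall(tn, fp)
```

Under these definitions both SP and NR fall only when false positives rise, that is, when legitimate tweets are flagged as spam; missed spam (false negatives) does not enter either formula.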
CITATION STYLE
Wu, T., Wang, D., Wen, S., & Xiang, Y. (2017). How spam features change in twitter and the impact to machine learning based detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10701 LNCS, pp. 898–904). Springer Verlag. https://doi.org/10.1007/978-3-319-72359-4_57