Over the last decade, a variety of machine-learning techniques have been proposed for building spam identification models. However, most of these models depend entirely on the extracted features and perform more efficiently when applied to large datasets. This paper proposes a temporal spam identification algorithm that uses time series to filter suspicious reviews from a Yelp review dataset. Feature-engineering techniques are then applied to the reviews labelled as suspicious. We combine behavioral and review-centric features with word and character n-grams, and classify reviews as spam or ham using a support vector machine. The proposed method can be used in real-time spam detection systems. A comparison with two other approaches indicates that the proposed algorithm achieves a higher accuracy (94%). It also narrows the search scope and reduces the heavy computation required for spam detection in large datasets.
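As a rough illustration of the two ideas the abstract names, the Python sketch below flags days on which a business receives an unusually high number of reviews and extracts character n-grams from review text. It is not the authors' algorithm: the function names, the mean-plus-k-standard-deviations burst rule, and the default values of `k` and `n` are all assumptions made for this example, and the behavioral features and the support vector machine classifier are omitted.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev


def burst_days(review_dates, k=2.0):
    """Return the days whose review count exceeds mean + k * std.

    Assumption: a "suspicious" day is a statistical outlier in the daily
    review-count series. The paper's actual temporal rule may differ.
    Only days with at least one review are part of the series.
    """
    counts = Counter(review_dates)  # day -> number of reviews that day
    values = list(counts.values())
    if len(values) < 2:
        return set()  # too little data to estimate a spread
    threshold = mean(values) + k * pstdev(values)
    return {day for day, n in counts.items() if n > threshold}


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical data: one review per day for nine days, then ten in one day.
dates = [date(2018, 1, d) for d in range(1, 10)] + [date(2018, 1, 10)] * 10
suspicious = burst_days(dates)
```

Filtering on burst days first means that only reviews posted on flagged days need to go through feature extraction and classification, which is one way to read the abstract's claim about reducing the search scope.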
CITATION STYLE
Muhammad, I., Qamar, U., & Khan, F. H. (2018). Temporal spam identification: A multifaceted approach to identifying review spam. In Advances in Intelligent Systems and Computing (Vol. 869, pp. 773–787). Springer Verlag. https://doi.org/10.1007/978-3-030-01057-7_58