Web spam troubles both internet users and search engine companies: it seriously damages the reliability of search engines, harms the interests of Web users, and degrades the quality of information on the Web. This paper discusses a Web spam detection method inspired by the Ant Colony Optimization (ACO) algorithm. The approach consists of two stages: preprocessing and Web spam detection. In the preprocessing stage, the class-imbalance problem is solved using a clustering technique, an optimal feature subset is selected using Chi-square statistics, and the dataset is discretized with an information-entropy method. These steps make spam detection in the second stage easier and more efficient. In the second stage, a spam detection model is built with the ant colony optimization algorithm. Experimental results on the WEBSPAM-UK2006 dataset show that our approach achieves the same or even better results with fewer features. © 2014 Springer International Publishing Switzerland.
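The Chi-square feature selection mentioned in the preprocessing stage can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `chi_square_score` and `select_top_k`, the toy feature names, and the assumption that features are already discrete are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Chi-square statistic between one discrete feature and the class label.

    Hypothetical helper: assumes the feature is already discretized.
    """
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature_values, labels))

    score = 0.0
    for f, f_count in feature_counts.items():
        for c, c_count in label_counts.items():
            # Count expected in this cell if feature and class were independent.
            expected = f_count * c_count / n
            observed = joint_counts.get((f, c), 0)
            score += (observed - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (made up): 1 = spam, 0 = non-spam.
labels = [1, 1, 0, 0]
columns = {
    "many_outlinks": [1, 1, 0, 0],  # tracks the label exactly
    "long_title": [1, 0, 1, 0],     # independent of the label
}
selected = select_top_k(columns, labels, k=1)
```

In this toy setup the feature that tracks the label is ranked first, while the independent feature scores zero. A real pipeline would apply the same scoring to the discretized WEBSPAM-UK2006 features before the ACO stage.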
CITATION STYLE
Tang, S. H., Zhu, Y., Yang, F., & Xu, Q. (2014). Ascertaining spam web pages based on ant colony optimization algorithm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8645 LNCS, pp. 231–239). Springer Verlag. https://doi.org/10.1007/978-3-319-10085-2_21