As the number of web documents grows rapidly, it becomes very difficult to find useful information. Search engines play a major role in retrieving relevant information and knowledge. They manage large amounts of information using efficient page ranking algorithms. Still, web spammers try to manipulate search engine results with various web spamming techniques for their personal benefit. According to a March 2016 report from Internetlivestats, an Internet survey company, there are currently 3.4 billion Internet users in the world. This figure suggests that search engines play a vital role in information retrieval. In this research, we investigated fifteen different machine learning classification algorithms over content-based features to classify spam and non-spam web pages. An ensemble is then built from the three algorithms that perform best across various evaluation parameters. Ten-fold cross-validation is also used.
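The two evaluation ideas named in the abstract, an ensemble of three classifiers and ten-fold cross-validation, can be illustrated with a minimal sketch. The abstract does not say how the three classifiers are combined, so hard majority voting is assumed here. The function names `k_fold_indices` and `majority_vote` and the sample predictions are hypothetical and do not come from the paper.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def majority_vote(per_model_predictions):
    """Combine one label list per model into a single list by hard voting."""
    # zip(*...) groups the votes that all models cast for the same page.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*per_model_predictions)]


# Hypothetical predictions from three base classifiers for five web pages.
preds_a = ["spam", "non-spam", "spam", "non-spam", "spam"]
preds_b = ["spam", "spam", "non-spam", "non-spam", "spam"]
preds_c = ["non-spam", "non-spam", "spam", "non-spam", "spam"]

combined = majority_vote([preds_a, preds_b, preds_c])

# Ten folds over 25 hypothetical pages; each page is tested exactly once.
folds = list(k_fold_indices(25, k=10))
```

With three voters and two labels, a vote can never tie, which is one practical reason to pick an odd number of base classifiers. In practice the data would be shuffled (and ideally stratified by label) before the folds are cut.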
CITATION STYLE
Makkar, A., & Goel, S. (2017). Spammer classification using ensemble methods over content-based features. In Advances in Intelligent Systems and Computing (Vol. 547, pp. 1–9). Springer Verlag. https://doi.org/10.1007/978-981-10-3325-4_1