The research of spam web page detection method based on web page differentiation and concrete cluster centers

Mei Yu; Jie Zhang; Jianrong Wang; Jie Gao; Tianyi Xu; Ruiguo Yu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10874 LNCS 820-826

DOI: 10.1007/978-3-319-94268-1_73

5 Citations

1 Reader

Abstract

To address two weaknesses of the PageRank algorithm, namely that it assigns link weights evenly and ignores the authority of web pages, we propose an improved PageRank algorithm based on web page differentiation (DPR). DPR evaluates a page's authority according to its number of links and, when assigning PR values, weights each page according to its authoritativeness. To improve the clustering stability and accuracy of the K-Means algorithm, we combine DPR with K-Means to design a differentiation page-based K-Means (DPK-Means) algorithm. This algorithm sorts the pages by the PR values obtained from DPR and then determines concrete cluster centers from the resulting order. Experiments show that, in spam detection, DPR is superior to PageRank in terms of number of pages, recall, accuracy, and F-measure, and that DPK-Means performs better than K-Means.
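
The abstract does not give the DPR weighting formula or the exact rule for choosing cluster centers, so the following Python sketch is only one possible reading of the two ideas, not the authors' implementation. It assumes that a target page's authority is approximated by its in-link count, that a page splits its PR among its out-links in proportion to that authority, and that the k initial centers are taken at evenly spaced positions in the PR-sorted page list. The function names, the damping factor of 0.85, and the iteration count are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def differentiated_pagerank(links, damping=0.85, iterations=50):
    """PageRank variant with uneven link weights (illustrative assumption).

    links maps each page to the list of pages it links to. A page shares
    its score among its targets in proportion to each target's in-link
    count, used here as an assumed proxy for authority.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)

    # Count distinct in-links per page (assumed authority measure).
    in_count = defaultdict(int)
    for src, targets in links.items():
        for dst in set(targets):
            in_count[dst] += 1

    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = sorted(set(links.get(src, ())))
            if not targets:
                # Dangling page: spread its score evenly over all pages.
                share = damping * rank[src] / n
                for p in pages:
                    new_rank[p] += share
                continue
            total = sum(in_count[t] for t in targets)
            for dst in targets:
                # Higher-authority targets receive a larger share.
                new_rank[dst] += damping * rank[src] * in_count[dst] / total
        rank = new_rank
    return rank


def seed_centers_from_ranking(rank, k):
    """Choose k initial cluster centers from the PR-sorted page list.

    Pages are sorted by descending score (ties broken by name) and one
    page is taken from the middle of each of k equal slices. Requires
    1 <= k <= len(rank).
    """
    ordered = sorted(rank, key=lambda p: (-rank[p], p))
    step = len(ordered) / k
    return [ordered[int(i * step + step / 2)] for i in range(k)]


if __name__ == "__main__":
    # Toy link graph, invented for this example.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    scores = differentiated_pagerank(graph)
    print(sorted(scores.items(), key=lambda item: -item[1]))
    print(seed_centers_from_ranking(scores, k=2))
```

Because the centers are derived deterministically from the ranking rather than drawn at random, repeated runs start K-Means from the same points, which is one plausible source of the stability improvement the abstract reports.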

Citation (APA)

Yu, M., Zhang, J., Wang, J., Gao, J., Xu, T., & Yu, R. (2018). The research of spam web page detection method based on web page differentiation and concrete cluster centers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10874 LNCS, pp. 820–826). Springer Verlag. https://doi.org/10.1007/978-3-319-94268-1_73
