Clustering Analysis to Improve Web Search Ranking Using PCA and RMSE

Abstract

Classifying web pages is the first step of web page ranking (also called indexing). One of the most common ways to carry out indexing is to cluster pages into groups by similarity; the lower the misclassification, the better the result. Clustering is a family of algorithms that divide data into groups of related items. We chose the Microsoft Learning to Rank dataset for analysis and model building; it is designed specifically for research in this field and contains a large and varied body of information about the ranking process. Because of the volume of data, we randomly selected only 16,015 observations from MSLR-WEB30K_2 _ fold 1, in line with the capacity of our hardware and of the analysis algorithms, some of which (those used to determine the optimal number of clusters) cannot handle a very large number of observations. In this paper, we use clustering analysis to improve web search ranking, applying principal component analysis (PCA) with root mean square error (RMSE) as a feature reduction technique: the error rate and accuracy of the model are computed to find the best number of attributes. This process was carried out with a cross-validation approach, using the extreme gradient boosting algorithm as the training model to estimate the sum of errors during training.
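
The selection step described in the abstract (reduce features with PCA, then pick the number of components that gives the lowest cross-validated RMSE) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the fold count, and the synthetic data are assumptions, and an ordinary least-squares model stands in for the extreme gradient boosting model so that the sketch depends only on NumPy.

```python
import numpy as np

def pca_fit(X, k):
    """Return the training mean and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def rmse(y_true, y_pred):
    """Root mean square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cv_rmse(X, y, k, n_folds=5, seed=0):
    """Mean held-out RMSE over n_folds folds, using k principal components."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Fit PCA on the training fold only, then project both folds.
        mean, comps = pca_fit(X[train_idx], k)
        Z_train = (X[train_idx] - mean) @ comps.T
        Z_test = (X[test_idx] - mean) @ comps.T
        # Stand-in model (assumption): least squares with an intercept term,
        # used here in place of a gradient-boosted model.
        A_train = np.column_stack([Z_train, np.ones(len(Z_train))])
        A_test = np.column_stack([Z_test, np.ones(len(Z_test))])
        w, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        scores.append(rmse(y[test_idx], A_test @ w))
    return float(np.mean(scores))

def best_n_components(X, y, candidates):
    """Return the candidate k with the lowest cross-validated RMSE, plus all scores."""
    scores = {k: cv_rmse(X, y, k) for k in candidates}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    # Synthetic stand-in data (assumption): 300 rows, 12 features.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(300, 12))
    y = X @ rng.normal(size=12) + 0.1 * rng.normal(size=300)
    k, scores = best_n_components(X, y, candidates=[2, 4, 8, 12])
    print("best number of components:", k)
    for n, s in scores.items():
        print(f"  k={n}: RMSE={s:.4f}")
```

Fitting PCA inside each fold, rather than once on the full data, keeps the held-out rows from influencing the projection. To move closer to the setup the abstract describes, the least-squares step could be replaced with a gradient-boosted regressor.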

Cite

Ko’adan, M. A., Bamatraf, M. A., & Shafal, K. Q. (2021). Clustering Analysis to Improve Web Search Ranking Using PCA and RMSE. In Advances in Intelligent Systems and Computing (Vol. 1188, pp. 93–105). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-6048-4_9
