Web pages change frequently, so crawlers have to download them often. Various policies have been proposed for refreshing local copies of web pages. In this paper, we introduce a new sampling method that outperforms other change detection methods in experiments. Change Frequency (CF) is a method that predicts how often pages change and, in the long run, achieves optimal efficiency compared with the sampling method. We propose a new hybrid method that combines our sampling approach with CF, and we show how this hybrid method improves the efficiency of change detection. © Springer-Verlag Berlin Heidelberg 2005.
Ghodsi, M., Hassanzadeh, O., Kamali, S., & Monemizadeh, M. (2005). A hybrid approach for refreshing web page repositories. In Lecture Notes in Computer Science (Vol. 3453, pp. 588–593). Springer Verlag. https://doi.org/10.1007/11408079_54