Characterizing crawler behavior from web server access logs

Abstract

In this paper, we present a study of crawler behavior based on Web-server access logs. To this end, we use logs from five different academic sites in three countries. Based on these logs, we analyze the activity of different crawlers that belong to five search engines: Google, AltaVista, Inktomi, FastSearch, and CiteSeer. We compare crawler behavior to the characteristics of general World-Wide Web traffic and to general characterization studies based on Web-server access logs. We analyze crawler requests to derive insights into the behavior and strategy of crawlers. Our results and observations provide useful insights into crawler behavior and serve as a basis for our ongoing work on the automatic detection of WWW robots.

Cite

Dikaiakos, M., Stassopoulou, A., & Papageorgiou, L. (2003). Characterizing crawler behavior from web server access logs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2738, pp. 369–378). Springer Verlag. https://doi.org/10.1007/978-3-540-45229-4_36
