Web Crawling

  • Olston C
  • Najork M

Abstract

This is a survey of the science and practice of web crawling. At first glance, web crawling may appear to be merely an application of breadth-first search, but in practice it raises many challenges, ranging from systems concerns such as managing very large data structures to theoretical questions such as how often to revisit evolving content sources. This survey outlines the fundamental challenges, describes the state-of-the-art models and solutions, and highlights avenues for future work.
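To make the "first glance" view concrete, here is a minimal sketch of crawling as plain breadth-first search. It is an illustration only, not the survey's method: the names `bfs_crawl`, `fetch_links`, and `TOY_WEB` are hypothetical, and the "web" is a small in-memory link graph rather than real HTTP fetches. It deliberately leaves out the concerns the abstract names, such as frontier data structures that exceed memory and policies for revisiting pages that change.

```python
from collections import deque


def bfs_crawl(seeds, fetch_links, max_pages=100):
    """Visit pages in breadth-first order starting from the seed URLs.

    fetch_links is a caller-supplied function (an assumption of this sketch)
    that returns the outgoing links of a page.
    """
    unique_seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    frontier = deque(unique_seeds)             # FIFO queue of pages still to visit
    seen = set(unique_seeds)                   # every URL ever enqueued
    order = []                                 # pages in the order they were visited

    while frontier and len(order) < max_pages:
        url = frontier.popleft()
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:               # enqueue each URL at most once
                seen.add(link)
                frontier.append(link)
    return order


# Hypothetical four-page link graph standing in for the web.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "a"],
    "d": [],
}

order = bfs_crawl(["a"], lambda url: TOY_WEB.get(url, []))
print(order)
```

In this sketch both `seen` and `frontier` live entirely in memory and every page is visited exactly once, which is where the systems and revisit-scheduling questions raised in the abstract begin.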


Authors

  • Christopher Olston
  • Marc Najork
