The goal of the work presented in this paper is to obtain large amounts of semistructured data from the web, which is a prerequisite for large-scale query answering over web sources. We contrast our approach with conventional web crawlers, and describe and evaluate a five-step pipelined architecture for crawling and indexing data from both the traditional web and the Semantic Web. © Springer-Verlag Berlin Heidelberg 2006.
CITATION STYLE
Harth, A., Umbrich, J., & Decker, S. (2006). MultiCrawler: A pipelined architecture for crawling and indexing semantic web data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4273 LNCS, pp. 258–271). Springer Verlag. https://doi.org/10.1007/11926078_19