This paper presents Nautilus, a generic framework for crawling the deep Web. We provide an abstraction of the deep Web crawling process and a mechanism for integrating heterogeneous business modules. A Federal Decentralized Architecture is proposed to combine the advantages of existing P2P networking architectures. We also present effective policies for scheduling crawling tasks. Experimental results show that our scheduling policies achieve good load balance and overall throughput.
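The abstract does not detail the paper's scheduling policies. As a generic illustration of load-balanced task scheduling only (the greedy least-loaded heuristic and all names below are our own, not Nautilus's actual policy), one common sketch assigns each crawl task to the currently least-loaded node:

```python
import heapq

def assign_tasks(tasks, num_nodes):
    """Greedy least-loaded assignment (illustrative, not the paper's policy).

    tasks: list of (task_id, estimated_cost) pairs.
    Returns a dict mapping node id -> list of assigned task ids.
    """
    # Min-heap of (current_load, node_id); the root is the least-loaded node.
    heap = [(0.0, n) for n in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {n: [] for n in range(num_nodes)}
    for task_id, cost in tasks:
        load, node = heapq.heappop(heap)   # pick least-loaded node
        assignment[node].append(task_id)
        heapq.heappush(heap, (load + cost, node))  # update its load
    return assignment

# Example: four equal-cost tasks spread evenly over two nodes.
result = assign_tasks([("a", 1), ("b", 1), ("c", 1), ("d", 1)], 2)
```

Real deep-Web crawl schedulers must also weigh per-site politeness limits and heterogeneous task costs, which this toy heuristic ignores.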
CITATION STYLE
Zhao, J., & Wang, P. (2012). Nautilus: A generic framework for crawling deep web. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7696, pp. 141–151). Springer-Verlag. https://doi.org/10.1007/978-3-642-34679-8_14