Web crawler for event-driven crawling of AJAX-based web applications

Guoshi Wu; Fanfan Liu

Conference Proceedings

Lecture Notes in Electrical Engineering (2013) 236 LNEE 191-200

DOI: 10.1007/978-1-4614-7010-6_22

3 Citations

6 Readers

Abstract

This paper describes a novel technique for crawling AJAX-based applications through "event-driven" crawling in web browsers. The algorithm uses the browser context to analyse the DOM: it scans the DOM tree, detects elements capable of changing the application state, triggers events on those elements, and extracts the resulting dynamic DOM content. An AJAX web application serves as a running example to explain the approach. The authors implement the concepts and algorithms discussed in the paper in a tool, and they report a number of empirical studies in which they apply the approach to representative AJAX applications. The results show that the method achieves better performance, often with a faster rate of state discovery. "Event-driven" crawling can effectively and accurately crawl dynamic content from AJAX-based applications. © 2013 Springer Science+Business Media New York.
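The loop the abstract outlines (scan the DOM tree, detect state-changing elements, trigger events, extract the resulting DOM) can be illustrated with a minimal sketch. This is not the authors' tool or algorithm: it assumes a toy in-memory DOM made of nested dictionaries in place of a real browser context, and every name in it (`find_clickables`, `crawl`, `TOY_DOM`, `TOY_HANDLERS`, the `onclick` key) is hypothetical and chosen only for illustration.

```python
import copy
import hashlib
import json
from collections import deque


def find_clickables(node, path=()):
    """Scan the DOM tree and yield the path of every element with an 'onclick' handler."""
    if "onclick" in node:
        yield path
    for index, child in enumerate(node.get("children", [])):
        yield from find_clickables(child, path + (index,))


def node_at(dom, path):
    """Follow a path of child indices from the root to one element."""
    for index in path:
        dom = dom["children"][index]
    return dom


def fingerprint(dom):
    """Hash the serialized DOM so that identical states are detected as duplicates."""
    return hashlib.sha256(json.dumps(dom, sort_keys=True).encode()).hexdigest()


def crawl(initial_dom, handlers, max_states=100):
    """Breadth-first, event-driven exploration of the DOM states reachable by clicks."""
    states = {fingerprint(initial_dom): initial_dom}
    edges = []  # (source state, clicked element path, target state)
    queue = deque([initial_dom])
    while queue:
        dom = queue.popleft()
        source = fingerprint(dom)
        for path in find_clickables(dom):
            candidate = copy.deepcopy(dom)  # fire the event on a copy of the state
            handlers[node_at(candidate, path)["onclick"]](candidate)
            target = fingerprint(candidate)
            edges.append((source, path, target))
            if target not in states and len(states) < max_states:
                states[target] = candidate  # extract the new dynamic content
                queue.append(candidate)
    return states, edges


def show(text):
    """Build a toy handler that mimics an AJAX update of the content <div>."""
    def handler(dom):
        dom["children"][2]["text"] = text
    return handler


# Hypothetical toy application: two links that replace the content of a <div>.
TOY_DOM = {"tag": "body", "children": [
    {"tag": "a", "onclick": "show_news", "text": "News"},
    {"tag": "a", "onclick": "show_about", "text": "About"},
    {"tag": "div", "text": ""},
]}
TOY_HANDLERS = {"show_news": show("latest news"), "show_about": show("about us")}

if __name__ == "__main__":
    found_states, found_edges = crawl(TOY_DOM, TOY_HANDLERS)
    print(len(found_states), "states,", len(found_edges), "transitions")
```

In a real crawler the dictionary DOM and the handler table would be replaced by a browser driven through an automation interface, and the state fingerprint would likely need to ignore irrelevant differences such as timestamps or advertisements; those choices are outside what the abstract specifies.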

Cite (APA)

Wu, G., & Liu, F. (2013). Web crawler for event-driven crawling of AJAX-based web applications. In Lecture Notes in Electrical Engineering (Vol. 236 LNEE, pp. 191–200). Springer Verlag. https://doi.org/10.1007/978-1-4614-7010-6_22
