Automatic extraction of logical web lists


Abstract

Recently, there has been increased interest in the extraction of structured data from the Web (both the "Surface" Web and the "Hidden" Web). In this paper we focus on the automatic extraction of Web lists. Although this task has been studied extensively, existing approaches assume that lists are wholly contained in a single Web page. They do not consider that many websites spread their listings across several Web pages and show only a partial view on each. Similar to databases, where a view can represent a subset of the data contained in a table, these websites split a logical list into multiple views (view lists). Automatic extraction of logical lists is an open problem. To tackle this issue, we propose an unsupervised and domain-independent algorithm for logical list extraction. Experimental results on real-life and data-intensive websites confirm the effectiveness of our approach. © 2014 Springer International Publishing.

Cite

Lanotte, P. F., Fumarola, F., Ceci, M., Scarpino, A., Torelli, M. D., & Malerba, D. (2014). Automatic extraction of logical web lists. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8502 LNAI, pp. 365–374). Springer Verlag. https://doi.org/10.1007/978-3-319-08326-1_37
