Current wrapper approaches break down when extracting data from differently structured and frequently changing Web pages. To tackle this challenge, this paper defines a domain-specific ontology, automatically captures the semantic hierarchy of Web pages by exploiting both structural and common formatting information, and recognizes and extracts data through ontology-based semantic matching rather than page-specific formatting. The approach adapts to differently structured and frequently changing Web pages within a domain of interest. © Springer-Verlag Berlin Heidelberg 2005.
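To make the idea of matching without page-specific formatting concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a toy ontology (`ONTOLOGY`, hypothetical) that maps each concept to a set of label synonyms, and it treats any text chunk that matches a synonym as a label whose value is the next text chunk. The hierarchy-capturing step described in the abstract is deliberately left out; the names `match_concept`, `TextChunkParser`, and `html_to_xml` are illustrative only.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET

# Hypothetical toy ontology: concept name -> label synonyms (assumption, not from the paper).
ONTOLOGY = {
    "title": {"title", "name"},
    "price": {"price", "cost"},
    "author": {"author", "writer"},
}


def match_concept(label):
    """Return the ontology concept whose synonyms contain the label, or None."""
    key = label.strip().rstrip(":").strip().lower()
    for concept, synonyms in ONTOLOGY.items():
        if key in synonyms:
            return concept
    return None


class TextChunkParser(HTMLParser):
    """Collect non-empty text chunks in document order, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_to_xml(html, root_tag="record"):
    """Emit one XML element per recognized label, using the next chunk as its value."""
    parser = TextChunkParser()
    parser.feed(html)
    parser.close()

    root = ET.Element(root_tag)
    chunks = parser.chunks
    i = 0
    while i < len(chunks) - 1:
        concept = match_concept(chunks[i])
        if concept is not None:
            ET.SubElement(root, concept).text = chunks[i + 1]
            i += 2  # skip the value we just consumed
        else:
            i += 1
    return ET.tostring(root, encoding="unicode")
```

Because the sketch looks only at label text and never at tag names, a table-based page and a heading-based page that carry the same labels and values produce the same XML. A fuller treatment would first recover the nesting of labels and values from structural and formatting cues, as the abstract describes.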
Li, S., Ou, W., & Yu, J. (2005). Ontology-based HTML to XML conversion. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3739 LNCS, pp. 888–893). Springer Verlag. https://doi.org/10.1007/11563952_98