In this paper, we propose a new learning approach to Web data annotation, in which a support vector machine (SVM)-based multiclass classifier is trained to assign labels to data items. To improve the performance of Web data record extraction, we also introduce a data section re-segmentation algorithm based on visual and content features. We have implemented the proposed approach and tested it on a large set of Web query result pages from different domains. Our experimental results show that the approach is highly effective and efficient.
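To make the annotation step concrete, the following is a minimal sketch of a multiclass SVM that assigns labels to data items. It is not the authors' implementation: the abstract does not specify the features, kernel, or multiclass strategy, so the three content features, the one-vs-rest scheme, the linear model trained by subgradient descent on the regularized hinge loss, the toy training items, and all function names below are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Model = Dict[str, Tuple[List[float], float]]  # label -> (weights, bias)


def features(item: str) -> List[float]:
    """Hypothetical content features for one data item (not from the paper)."""
    n = max(len(item), 1)
    return [
        1.0 if "$" in item else 0.0,          # currency symbol present
        sum(c.isdigit() for c in item) / n,   # fraction of digit characters
        sum(c.isalpha() for c in item) / n,   # fraction of letters
    ]


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear decision value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_binary(xs: Sequence[Sequence[float]], ys: Sequence[float],
                 epochs: int = 300, lr: float = 0.1,
                 lam: float = 0.001) -> Tuple[List[float], float]:
    """Linear SVM: per-sample subgradient descent on L2-regularized hinge loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * score(w, b, x)
            # Gradient of the regularizer shrinks the weights (bias is not regularized).
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move the boundary toward classifying x correctly.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_one_vs_rest(items: Sequence[str], labels: Sequence[str]) -> Model:
    """Train one binary SVM per label (assumed one-vs-rest multiclass strategy)."""
    xs = [features(item) for item in items]
    return {
        label: train_binary(xs, [1.0 if l == label else -1.0 for l in labels])
        for label in sorted(set(labels))
    }


def annotate(model: Model, item: str) -> str:
    """Assign the label whose binary SVM gives the highest decision value."""
    x = features(item)
    return max(model, key=lambda label: score(model[label][0], model[label][1], x))


# Toy labeled data items, invented for this sketch.
TRAIN_ITEMS = ["$12.99", "$7.50", "$104.00",
               "2014", "1999", "2008",
               "Data Mining Concepts", "Web Data Extraction", "Learning to Rank"]
TRAIN_LABELS = ["price"] * 3 + ["year"] * 3 + ["title"] * 3

model = train_one_vs_rest(TRAIN_ITEMS, TRAIN_LABELS)
predicted = [annotate(model, item) for item in TRAIN_ITEMS]
```

In this sketch each data item is reduced to a small numeric feature vector and labeled by the highest-scoring binary classifier. A practical system would likely use richer features and an established SVM library rather than this hand-rolled trainer.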
CITATION STYLE
Weng, D., Hong, J., & Bell, D. A. (2014). Automatically annotating structured Web data using a SVM-based multiclass classifier. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8786, 115–124. https://doi.org/10.1007/978-3-319-11749-2_9