With vertical search engines, users can search web pages in a specific domain, such as products, restaurants, or academic papers, and see only the information of interest. Gathering and integrating such objects from multiple web pages into a single system is a useful facility for users. Placing objects extracted from multiple data sources into a single hierarchical structure is a challenging classification problem, especially when the objects have few attributes. In this work, we propose a confidence-based incremental Naïve Bayesian approach for categorization, focusing on the product domain. The incremental approach extends the training set and retrains the classifier as new objects are assigned to a category with high confidence. The ordering of product data is taken into account as well. The proposed approach is applied to a vertical search engine that collects product data from several online stores. © 2012 Springer-Verlag.
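The incremental idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NaiveBayes`, the function `classify_incrementally`, the use of title words as the only features, Laplace smoothing, and the threshold value are all assumptions made for illustration. A product is added to the training set only when the classifier's top posterior reaches the threshold, and the model is updated before the next product is seen.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over word tokens with Laplace smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # training examples per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()

    def add(self, text, label):
        """Add one labeled example; counts are updated in place."""
        tokens = text.lower().split()
        self.class_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def posterior(self, text):
        """Return normalized P(category | text) for every known category."""
        tokens = text.lower().split()
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        # Subtract the maximum before exponentiating for numerical stability.
        peak = max(log_scores.values())
        exp = {c: math.exp(s - peak) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def classify_incrementally(labeled, unlabeled, threshold=0.7):
    """Assign categories in order, retraining after each confident assignment.

    The threshold is a hypothetical value chosen for this example.
    """
    model = NaiveBayes()
    for text, label in labeled:
        model.add(text, label)
    assigned, deferred = {}, []
    for text in unlabeled:
        probs = model.posterior(text)
        best = max(probs, key=probs.get)
        if probs[best] >= threshold:
            assigned[text] = best
            model.add(text, best)  # extend the training set with this object
        else:
            deferred.append(text)  # low confidence: leave uncategorized
    return assigned, deferred


# Hypothetical product titles, used only to exercise the sketch.
labeled = [
    ("samsung galaxy phone", "phones"),
    ("apple iphone phone", "phones"),
    ("dell laptop notebook", "laptops"),
    ("lenovo thinkpad laptop", "laptops"),
]
unlabeled = ["nokia phone", "asus laptop", "hp printer"]
assigned, deferred = classify_incrementally(labeled, unlabeled)
```

Because the model changes after every confident assignment, the order in which products are processed can change the outcome in this sketch. The abstract states that ordering is taken into account but does not say how, so no particular ordering strategy is shown here.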
CITATION STYLE
Ozdikis, O., Senkul, P., & Sinir, S. (2012). Confidence-based incremental classification for objects with limited attributes in vertical search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7345 LNAI, pp. 10–19). https://doi.org/10.1007/978-3-642-31087-4_2