On integrating catalogs

Abstract

We address the problem of integrating documents from different sources into a master catalog. This problem is pervasive in web marketplaces and portals. Current technology for automating this process consists of building a classifier that uses the categorization of documents in the master catalog to construct a model for predicting the category of unknown documents. Our key insight is that many of the data sources have their own categorization, and classification accuracy can be improved by factoring in the implicit information in these source categorizations. We show how a Naive Bayes classification can be enhanced to incorporate the similarity information present in source catalogs. Our analysis and empirical evaluation show substantial improvement in the accuracy of catalog integration.
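The idea in the abstract can be illustrated with a small two-pass sketch. This is not the authors' exact formulation: the function names (`train_nb`, `integrate`), the `weight` parameter, the add-one smoothing, and the toy data are all assumptions made for illustration. A plain Naive Bayes model is trained on the master catalog, each source document gets a first-pass prediction, and the prior for each master category is then boosted by how many documents from the same source category were predicted to land there.

```python
import math
from collections import Counter, defaultdict

def train_nb(master_docs):
    """master_docs: list of (tokens, master_category) pairs."""
    word_counts = defaultdict(Counter)   # category -> word frequencies
    doc_counts = Counter()               # category -> number of documents
    vocab = set()
    for tokens, cat in master_docs:
        word_counts[cat].update(tokens)
        doc_counts[cat] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def log_likelihood(tokens, cat, word_counts, vocab):
    # Multinomial Naive Bayes with add-one (Laplace) smoothing.
    total = sum(word_counts[cat].values())
    return sum(
        math.log((word_counts[cat][t] + 1) / (total + len(vocab)))
        for t in tokens if t in vocab
    )

def classify(tokens, model, log_prior):
    word_counts, doc_counts, vocab = model
    return max(
        doc_counts,
        key=lambda c: log_prior[c] + log_likelihood(tokens, c, word_counts, vocab),
    )

def integrate(source_docs, model, weight=2.0):
    """source_docs: list of (tokens, source_category) pairs.

    Returns (plain_predictions, enhanced_predictions).
    `weight` (hypothetical) controls how much the source categorization counts.
    """
    _, doc_counts, _ = model
    n = sum(doc_counts.values())
    base_prior = {c: math.log(doc_counts[c] / n) for c in doc_counts}

    # Pass 1: plain Naive Bayes, ignoring the source categories.
    plain = [classify(tokens, model, base_prior) for tokens, _ in source_docs]

    # Tally where the documents of each source category were sent.
    votes = defaultdict(Counter)
    for (_, src_cat), predicted in zip(source_docs, plain):
        votes[src_cat][predicted] += 1

    # Pass 2: boost master categories favored by the same source category.
    enhanced = []
    for tokens, src_cat in source_docs:
        prior = {
            c: base_prior[c] + weight * math.log(votes[src_cat][c] + 1)
            for c in doc_counts
        }
        enhanced.append(classify(tokens, model, prior))
    return plain, enhanced

# Toy data (invented for this example).
master = [
    ("camera lens zoom".split(), "Photo"),
    ("camera digital zoom".split(), "Photo"),
    ("phone battery screen".split(), "Phones"),
    ("phone camera screen".split(), "Phones"),
]
source = [
    ("lens zoom".split(), "Cams"),
    ("digital zoom".split(), "Cams"),
    ("camera screen".split(), "Cams"),   # ambiguous on its words alone
]

model = train_nb(master)
plain, enhanced = integrate(source, model)
print(plain)      # ['Photo', 'Photo', 'Phones']
print(enhanced)   # ['Photo', 'Photo', 'Photo']
```

In this toy run the third document looks more like "Phones" on word evidence alone, but because its source siblings in "Cams" were sent to "Photo", the adjusted prior tips it to "Photo". Setting `weight=0` reduces the second pass to plain Naive Bayes.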

Citation (APA)

Agrawal, R., & Srikant, R. (2001). On integrating catalogs. In Proceedings of the 10th International Conference on World Wide Web, WWW 2001 (pp. 603–612). Association for Computing Machinery, Inc. https://doi.org/10.1145/371920.372163
