The large number of available Web data sources represents an important opportunity for Web users and for data-intensive Web applications. Nevertheless, selecting the most relevant data sources, and thus obtaining high-quality information, remains a challenging issue. This paper proposes an approach to data source selection based on the notion of the reputation of data sources. The data quality literature defines reputation as a multidimensional quality attribute that measures the trustworthiness and importance of an information source. This paper introduces a set of metrics that measure the reputation of a Web source by considering its authority, its relevance in a given context, and the quality of its content. These variables have been empirically assessed for the top 20 sources returned by Google in response to 100 queries in the tourism domain. In particular, Google's ranking has been compared with the ranking obtained by means of a multidimensional source reputation index. Results show that the assessment of reputation is a tangible aid to the selection of information sources and to the identification of reliable data. © 2011 Springer-Verlag Berlin Heidelberg.
Barbagallo, D., Cappiello, C., Francalanci, C., & Matera, M. (2011). Enhancing the selection of Web sources: A reputation based approach. In Lecture Notes in Business Information Processing (Vol. 73 LNBIP, pp. 464–476). Springer Verlag. https://doi.org/10.1007/978-3-642-19802-1_32