Quantifying the bias in data links

Ilaria Tiddi; Mathieu D’aquin; Enrico Motta

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8876 531-546

DOI: 10.1007/978-3-319-13704-9_40

Abstract

The main idea behind Linked Data is to connect data from different sources in order to build a hub of shared, publicly accessible knowledge. While the benefit of sharing knowledge is universally recognised, it is less visible how much results can be affected when knowledge is not equally distributed between one dataset and the datasets connected to it. This lack of balance in information, or bias, is generally assumed a priori, but it can actually be quantified to improve the quality of the results of applications and analytics relying on such linked data. In this paper, we propose a process to measure how much bias one dataset contains when compared to another, by identifying the most affected RDF properties and values within the set of entities that the two datasets have in common (defined as the linkset). This process was run on a wide range of linksets from Linked Data; the experiment section presents the results as well as measures of the process's performance.
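As an illustration only, the idea of comparing a linkset against the dataset it is drawn from can be sketched in a few lines of Python. This is not the authors' procedure, which the abstract does not detail: the total variation distance used as the score, the function names `value_shares` and `property_bias`, and the toy dictionaries standing in for RDF entities are all assumptions made for the example.

```python
from collections import Counter


def value_shares(entities, prop):
    """Relative frequency of each value of `prop` among `entities`."""
    counts = Counter(e[prop] for e in entities if prop in e)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}


def property_bias(dataset, linkset_ids, prop):
    """Hypothetical bias score for one property.

    Returns the total variation distance between the value distribution
    in the linkset and in the full dataset, plus the per-value differences
    (positive means the value is over-represented in the linkset).
    """
    linked = [e for e in dataset if e["id"] in linkset_ids]
    full = value_shares(dataset, prop)
    sub = value_shares(linked, prop)
    per_value = {v: sub.get(v, 0.0) - full.get(v, 0.0) for v in set(full) | set(sub)}
    score = 0.5 * sum(abs(d) for d in per_value.values())
    return score, per_value


# Toy data: plain dictionaries stand in for RDF entities and their properties.
dataset = [
    {"id": "a", "country": "UK", "type": "City"},
    {"id": "b", "country": "UK", "type": "City"},
    {"id": "c", "country": "IT", "type": "City"},
    {"id": "d", "country": "FR", "type": "Village"},
]
linkset_ids = {"a", "b"}  # entities that also appear in the other dataset

# Rank properties from most to least affected.
ranking = sorted(
    ("country", "type"),
    key=lambda p: property_bias(dataset, linkset_ids, p)[0],
    reverse=True,
)
print(ranking)
```

In this toy case the linkset contains only UK entities, so `country` ranks above `type`; a real analysis would read the properties and values from the RDF data and would likely use a statistical test rather than a raw distance.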

Cite

Tiddi, I., D’aquin, M., & Motta, E. (2014). Quantifying the bias in data links. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8876, pp. 531–546). Springer Verlag. https://doi.org/10.1007/978-3-319-13704-9_40