Sign up & Download
Sign in

An empirical survey of linked data conformance

by Aidan Hogan, Jürgen Umbrich, Andreas Harth, Richard Cyganiak, Axel Polleres, Stefan Decker
Web Semantics Science Services and Agents on the World Wide Web (2012)

Abstract

There has been a recent, tangible growth in RDF published on the Web in accordance with the Linked Data principles and best practices, the result of which has been dubbed the Web of Data. Linked Data guidelines are designed to facilitate ad hoc re-use and integration of conformant structured dataacross the Webby consumer applications; however, thus far, systems have yet to emerge that convincingly demonstrate the potential applications for consuming currently available Linked Data. Herein, we compile a list of fourteen concrete guidelines as given in the How to Publish Linked Data on the Web tutorial. Thereafter, we evaluate conformance of current RDF data providers with respect to these guidelines. Our evaluation is based on quantitative empirical analyses of a crawl of 4 million RDF/XML documents constituting over 1 billion quadruples, where we also look at the stability of hosted documents for a corpus consisting of nine monthly snapshots from a sample of 151 thousand documents. Backed by our empirical survey, we provide insights into the current level of conformance with respect to various Linked Data guidelines, enumerating lists of the most (non-)conformant data providers. We show that certain guidelines are broadly adhered to (esp. use HTTP URIs, keep URIs stable), whilst others are commonly overlooked (esp. provide licencing and human-readable meta-data). We also compare PageRank scores for the data-providers and their conformance to Linked Data guidelines, showing that both factors negatively correlate for guidelines restricting use of RDF features, while positively correlating for guidelines encouraging external linkage and vocabulary re-use. Finally, we present a summary of conformance for the different guidelines, and present the top-ranked data providers in terms of a combined PageRank and Linked Data conformance score.

Author-supplied keywords

Cite this document (BETA)

Sign up today - FREE

Mendeley saves you time finding and organizing research. Learn more

  • All your research in one place
  • Add and import papers easily
  • Access it anywhere, anytime

Start using Mendeley in seconds!

Already have an account? Sign in

Readership Statistics

5 Readers on Mendeley
by Discipline
 
 
by Academic Status
 
40% Ph.D. Student
 
20% Student (Bachelor)
 
20% Librarian
by Country
 
40% United States
 
20% Sweden
 
20% Portugal

Groups

Freebase