R2D: A Bridge between the Semantic Web and Relational Visualization Tools

Sunitha Ramanujam¹, Anubha Gupta¹, Latifur Khan¹, Steven Seida², Bhavani Thuraisingham¹

¹ The University of Texas at Dallas, Richardson, Texas, U.S.A.
{sxr063200, axg089100, lkhan, bxt043000}@utdallas.edu
² Raytheon Corporation, Garland, Texas, U.S.A.
steven_b_seida@raytheon.com

Abstract: The widespread deployment of Resource Description Framework has resulted in the emergence of a new data storage paradigm, the RDF Graph Model, which, in turn, requires a rich suite of modeling and visualization tools to aid with data management. This paper presents R2D (RDF-to-Database), an effort whose goal is to enable reusability of relational tools on RDF data. R2D aims to transform RDF data, at run-time, into an equivalent normalized relational schema, thereby bridging the gap between RDF and RDBMS concepts and making the abundance of existing relational tools available to RDF Stores. The work in this paper extends our earlier work by including the ability to map blank nodes, which are used to represent complex relationships between entities, and to perform pattern matching and aggregation functions on data. The R2D system architecture and mapping constructs, with particular emphasis on blank node handling, are presented along with descriptions of the algorithms comprising R2D. Performance graphs and screen-shots of a relational visualization tool that uses R2D to access RDF data are presented as evidence of the feasibility of our research.

Keywords: Semantic Web, Resource Description Framework, Relational Databases, Data Interoperability

I. INTRODUCTION

In today's increasingly networked world, the need to augment human reasoning has kicked off the Semantic Web initiative, for which various standards are being developed. One such standard, the Resource Description Framework [1], is the current buzzword in the Semantic Web Community and the focus of the work in this paper.
RDF's simplicity and suitability to unstructured and semi-structured data that is typically available on the web have increased the demand for data stores that use the RDF Graph data model and offer the ability to store and query RDF data [2]. The growing number of RDF stores has, as with any data store holding massive amounts of information, spawned an associated requirement for tools to manage and visualize this data. However, most of the data modeling, visualization, and business intelligence tools that are widely available in the market today are still based on the more mature relational model [3]. Further, small and medium-sized organizations that are resource constrained may not have the ability or inclination to take the risks associated with investing in fledgling technologies such as RDF and the tools built for it [4]. To avoid the learning curves associated with new tools, and to continue to leverage the advantages offered by traditional/relational tools without losing out on the benefits offered by the newer web technologies and standards, the gap between the two needs to be bridged. The motivation behind our research is to arrive at a solution to the bridging problem without the need to create an actual physical relational schema and duplicate/synchronize data. Our approach, called R2D (RDF-to-Database), provides a relational interface to data stored in the form of RDF triples. R2D, which is a relational wrapper around RDF data stores, is a bridge intended to enable existing relational tools to work seamlessly with RDF Stores without having to make extensive modifications or waste valuable resources by replicating data unnecessarily. This paper elaborates on [5] and extends the work in [6] by including the ability to handle blank nodes and RDF container objects.
Blank nodes are nodes that are neither URI references nor literals and are typically used to associate a resource with a set of properties that together represent complex data. They are a vital component of RDF graphs and their relationalization is the primary focus of this paper. The paper also discusses enhancements to the SQL-to-SPARQL transformation that now permit pattern matching and aggregation on RDF data.

Our contributions in this paper are:

• We propose a mapping scheme for the translation of RDF Graph structures to an equivalent normalized relational schema that extends the work in [6] by including the ability to process blank nodes and RDF Container objects.
• Based on the mapping file created, we propose a transformation process that presents, at run-time, a normalized, non-generic, domain-specific, virtual relational schema view of the given RDF store. The algorithm in [6] is extended through the addition of normalization rules for different blank node scenarios.
• We propose a mechanism, which now includes pattern matching and aggregation facilities, to transform any relational SQL queries issued against the virtual relational schema into the SPARQL equivalent, and return triples data to end-users in a tabular format.
• The proposed framework imposes no restrictions on the nature of RDF triples or their storage mechanisms as it is a purely virtual layer that does not involve duplication of the RDF data. Hence, data updates are immediately visible through R2D without explicit synchronization activities.

2009 IEEE International Conference on Semantic Computing. 978-0-7695-3800-6/09 $26.00 © 2009 IEEE. DOI 10.1109/ICSC.2009.29
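
To make the notion of a blank node and its relational view concrete, the following is a minimal sketch in Python. It is not the R2D mapping algorithm or its normalization rules. The triples, the `ex:` prefix, and the column-naming convention (`<parent predicate>_<child predicate>`) are all hypothetical choices made for illustration. The sketch inlines the properties of a blank node as columns of the row for the resource that points to it, which is only one simple way such data could be presented in tabular form.

```python
from collections import defaultdict

# Hypothetical triples: (subject, predicate, object). "_:b1" is a blank node
# grouping the parts of an address; all names here are illustrative only.
TRIPLES = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:address", "_:b1"),
    ("_:b1", "ex:street", "1 Main St"),
    ("_:b1", "ex:city", "Dallas"),
]


def is_blank(term):
    """Treat any term labelled with the '_:' prefix as a blank node."""
    return term.startswith("_:")


def local_name(term):
    """Strip the prefix from a prefixed name, e.g. 'ex:city' -> 'city'."""
    return term.split(":", 1)[1]


def flatten(triples):
    """Build one row per non-blank subject, inlining blank-node properties
    as columns named <parent predicate>_<child predicate>.

    Assumption: blank nodes are only one level deep; nested blank nodes
    are not expanded by this sketch.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    rows = {}
    for subject, props in by_subject.items():
        if is_blank(subject):
            continue  # blank nodes do not become rows of their own here
        row = {}
        for p, o in props:
            if is_blank(o):
                # Pull the blank node's properties up into the parent row.
                for child_p, child_o in by_subject.get(o, []):
                    row[f"{local_name(p)}_{local_name(child_p)}"] = child_o
            else:
                row[local_name(p)] = o
        rows[subject] = row
    return rows


ROWS = flatten(TRIPLES)
```

Here `ROWS` holds a single row for `ex:alice` with the columns `name`, `address_street`, and `address_city`. A relational tool would see such a row as an ordinary tuple, while the underlying data remains a set of triples that can be queried on demand.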