Many organizations and individuals face the problem of managing large amounts of distributed, heterogeneous data efficiently. Dataspace technology addresses this problem. A dataspace system is a new abstraction for integrating heterogeneous data sources distributed across sites; it offers an on-demand data integration solution with less effort and provides integrated search and query capabilities over those sources. Such a system requires a set of automatic wrappers to extract the desired data from the data sources. A wrapper extracts the requested data from its data source and populates it into the dataspace in the desired format (e.g., triple format). This work presents a set of rule-based wrappers for a dataspace system that operate in a "pay-as-you-go" manner. We have divided our work into two parts: discussing a set of Transformation Rules (TRSs), and designing a set of wrappers based on the TRSs. First, we explain how the TRSs work for structured, semi-structured, and unstructured data models; then, we discuss the design of rule-based wrappers for a dataspace system based on the TRSs. We have successfully implemented the wrappers for some real and synthetic data sets. Some of our wrappers are semi-automatic because they require human involvement during data extraction and translation.
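To make the idea of a wrapper that populates a dataspace in triple format more concrete, the following is a minimal sketch for the structured case only. It is not the authors' implementation and does not reproduce their TRSs: the function names `row_to_triples` and `wrap_table`, the `table/key` subject naming, and the rule of skipping empty values are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A triple is (subject, predicate, object), all kept as strings here.
Triple = Tuple[str, str, str]


def row_to_triples(table: str, key_column: str,
                   row: Dict[str, Optional[object]]) -> List[Triple]:
    """Hypothetical rule: one relational row becomes one triple per column.

    Assumed conventions (not taken from the paper):
    - the subject is "<table>/<key value>"
    - the predicate is the column name
    - columns with no value (None) produce no triple
    """
    subject = f"{table}/{row[key_column]}"
    triples: List[Triple] = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue
        triples.append((subject, column, str(value)))
    return triples


def wrap_table(table: str, key_column: str,
               rows: Iterable[Dict[str, Optional[object]]]) -> List[Triple]:
    """Apply the row rule to every row of a structured source."""
    triples: List[Triple] = []
    for row in rows:
        triples.extend(row_to_triples(table, key_column, row))
    return triples


if __name__ == "__main__":
    rows = [
        {"id": 7, "name": "Asha", "dept": None, "city": "Delhi"},
        {"id": 8, "name": "Ravi", "dept": "CS", "city": None},
    ]
    for triple in wrap_table("employee", "id", rows):
        print(triple)
```

Semi-structured and unstructured sources would need different rules (for example, walking nested elements or extracting entities from text), which is where the human involvement mentioned above is likely to come in.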
CITATION STYLE
Singh, M., Lal, N., & Yadav, S. (2019). Rule-based wrappers for a dataspace system. International Journal of Innovative Technology and Exploring Engineering, 8(6), 80–90.