Schema matching is critical for applications that manipulate data across heterogeneous, autonomous, and distributed data sources. The choice of schema matching approach depends on the number of data sources to be integrated: holistic matching approaches are best suited to a large or very large number of data sources, while pairwise matching approaches are best suited to a small or medium number. Nonetheless, state-of-the-art matching approaches achieve only moderate, and sometimes poor, matching accuracy. Furthermore, state-of-the-art holistic schema matching approaches proceed in a series of two-way matching steps rather than matching all schemas collectively. In this paper, we present hMatcher, an effective approach to holistic schema matching. To perform collective schema matching, hMatcher generates frequent schema elements before proceeding with the matching. To reach high matching accuracy, hMatcher employs a context-based semantic similarity measure. Experimental results on a real-world domain dataset show that hMatcher performs holistic schema matching properly, reaches a high matching accuracy (Precision = 0.89; Recall = 0.66; Overall = 0.57), and outperforms state-of-the-art matching approaches in terms of matching accuracy.
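The three accuracy figures above can be related with a short sketch. It assumes that "Overall" is the measure commonly used in the schema matching literature, Overall = Recall × (2 − 1/Precision); the abstract does not state the formula, so this is an assumption, and the function names `overall` and `evaluate` are hypothetical. Under that assumption, the rounded Precision and Recall give roughly 0.58, close to but not identical with the reported 0.57; the difference may come from rounding or from how the paper aggregates its results.

```python
from typing import Iterable, Optional, Tuple

Match = Tuple[str, str]  # a correspondence between two schema elements


def overall(precision: float, recall: float) -> float:
    """Overall = Recall * (2 - 1/Precision).

    Assumed definition; it is undefined for precision == 0 and
    becomes negative when precision < 0.5.
    """
    if precision <= 0:
        raise ValueError("Overall is undefined when precision is 0")
    return recall * (2 - 1 / precision)


def evaluate(found: Iterable[Match],
             reference: Iterable[Match]) -> Tuple[float, float, Optional[float]]:
    """Compare matches found by a matcher against a reference set of matches."""
    found_set, reference_set = set(found), set(reference)
    true_positives = len(found_set & reference_set)
    precision = true_positives / len(found_set) if found_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    # Overall is only reported when precision is positive.
    score = overall(precision, recall) if precision > 0 else None
    return precision, recall, score


if __name__ == "__main__":
    # Plugging in the rounded figures from the abstract.
    print(round(overall(0.89, 0.66), 3))

    # Toy example with made-up element names (illustrative only).
    found = [("name", "fullName"), ("zip", "postalCode"), ("tel", "fax")]
    reference = [("name", "fullName"), ("zip", "postalCode"),
                 ("tel", "phone"), ("dob", "birthDate")]
    print(evaluate(found, reference))
```

Unlike Precision and Recall, this measure penalizes false matches heavily: a matcher with Precision below 0.5 receives a negative Overall, reflecting that correcting its output would take more effort than matching by hand.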
CITATION STYLE
Yousfi, A., El Yazidi, M. H., & Zellou, A. (2020). hMatcher: Matching schemas holistically. International Journal of Intelligent Engineering and Systems, 13(5), 490–501. https://doi.org/10.22266/ijies2020.1031.43