In the big data era, data management and data analysis face rapidly growing demands. Heterogeneous information networks (HINs) are widely used as data models because their rich semantics can express complex correlations in data. Many data mining, data analysis and machine learning algorithms require a measure of similarity rather than exact matching. Graph edit distance (GED) is one feasible way to measure similarity between HINs. In this paper, we first extend the concept of GED from homogeneous graphs to heterogeneous information networks by introducing newly defined edit operations. We then propose a metapath-based approximation method to speed up similarity search over a full database, in which an upper bound and a lower bound, both computable in polynomial time, are used as filters. Finally, comprehensive experimental results show that the proposed method outperforms the existing method in terms of computational efficiency, bound tightness and similarity filtering capability.
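The way an upper bound and a lower bound can act as filters in a similarity search can be illustrated with a minimal sketch. The code below is not the metapath-based method of the paper: it assumes a generic threshold query (return every item within distance `tau` of the query) and, as a stand-in for GED on HINs, uses string edit distance with two simple bounds. The names `filter_and_verify`, `length_lower_bound` and `aligned_upper_bound` are hypothetical and chosen only for this illustration.

```python
from typing import Callable, List, Sequence


def filter_and_verify(
    query: str,
    database: Sequence[str],
    tau: int,
    lower_bound: Callable[[str, str], int],
    upper_bound: Callable[[str, str], int],
    exact_distance: Callable[[str, str], int],
) -> List[str]:
    """Return all items whose exact distance to `query` is at most `tau`.

    The cheap bounds decide most candidates; the expensive exact
    distance is computed only when the bounds are inconclusive.
    """
    results = []
    for item in database:
        if lower_bound(query, item) > tau:
            continue  # pruned: the true distance must exceed tau
        if upper_bound(query, item) <= tau:
            results.append(item)  # accepted: the true distance cannot exceed tau
            continue
        if exact_distance(query, item) <= tau:  # verification step
            results.append(item)
    return results


# Stand-in distance for the illustration (assumption: strings, not HINs).
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def length_lower_bound(a: str, b: str) -> int:
    # At least |len(a) - len(b)| insertions or deletions are needed.
    return abs(len(a) - len(b))


def aligned_upper_bound(a: str, b: str) -> int:
    # One valid edit script: substitute mismatches position by position,
    # then insert or delete the leftover tail.
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


if __name__ == "__main__":
    db = ["kitten", "sitting", "mitten", "kit", "bitten"]
    print(filter_and_verify("kitten", db, 1, length_lower_bound,
                            aligned_upper_bound, levenshtein))
```

In this sketch the tighter the two bounds are, the fewer candidates reach the exact distance computation, which is why bound tightness and filtering capability are reported alongside running time.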
CITATION STYLE
Lu, J., Lu, N., Ma, S., & Zhang, B. (2018). Edit distance based similarity search of heterogeneous information networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11030 LNCS, pp. 195–202). Springer Verlag. https://doi.org/10.1007/978-3-319-98812-2_16