Summarizing large-scale database schema using community detection

Xue Wang; Xuan Zhou; Shan Wang

Conference Proceedings

Journal of Computer Science and Technology (2012) 27(3) 515-526

DOI: 10.1007/s11390-012-1240-1

Abstract

Schema summarization on large-scale databases is a challenge. In a typical large database schema, a large proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, as a schema can be very big, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social networks. Our approach contains three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely. The generated abstract schema layers are very helpful for users exploring the database. © 2012 Springer Science+Business Media, LLC & Science Press, China.
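The first and third steps described in the abstract can be illustrated with a small sketch. The abstract does not say which community detection algorithm or which table-ranking measure the paper uses, so everything here is an assumption: the sketch substitutes a naive greedy modularity merge for the detection step and picks the highest-degree table as each group's label. The toy schema and the function names (`detect_communities`, `summarize`) are hypothetical, and the second step (clustering groups into abstract domains) is not covered.

```python
from itertools import combinations

# Hypothetical toy schema: each pair is a foreign-key link between two tables.
EDGES = [
    ("film", "actor"), ("film", "director"), ("actor", "director"),
    ("album", "artist"), ("album", "track"), ("artist", "track"),
    ("film", "album"),  # single link bridging the two subject areas
]

def table_degrees(edges):
    """Number of distinct foreign-key links touching each table."""
    links = {frozenset(e) for e in edges if e[0] != e[1]}
    tables = sorted({t for e in links for t in e})
    return links, {t: sum(t in e for e in links) for t in tables}

def detect_communities(edges):
    """Naive greedy modularity merging (a stand-in, not the paper's algorithm)."""
    links, degree = table_degrees(edges)
    m = len(links)
    communities = [{t} for t in sorted(degree)]
    while len(communities) > 1:
        best_gain, best_pair = 0.0, None
        for i, j in combinations(range(len(communities)), 2):
            ci, cj = communities[i], communities[j]
            # Links with exactly one endpoint in each of the two groups.
            between = sum(1 for e in links
                          if len(e & ci) == 1 and len(e & cj) == 1)
            di = sum(degree[t] for t in ci)
            dj = sum(degree[t] for t in cj)
            # Change in modularity if the two groups were merged.
            gain = between / m - di * dj / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:  # no merge improves modularity
            break
        i, j = best_pair
        merged = communities.pop(j)  # j > i, so index i stays valid
        communities[i] |= merged
    return communities

def summarize(edges):
    """Map each group's representative table to the sorted group members."""
    _, degree = table_degrees(edges)
    summary = {}
    for group in detect_communities(edges):
        # Assumed labelling rule: highest degree, ties broken alphabetically.
        label = min(group, key=lambda t: (-degree[t], t))
        summary[label] = sorted(group)
    return summary

if __name__ == "__main__":
    for label, members in summarize(EDGES).items():
        print(label, "->", members)
```

On this toy schema the merge stops at two groups because joining them across the single bridging link would lower modularity. The pairwise scan is quadratic in the number of groups per round, so a schema of Freebase's size would need a far more efficient detection method than this one.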

Cite

Wang, X., Zhou, X., & Wang, S. (2012). Summarizing large-scale database schema using community detection. Journal of Computer Science and Technology, 27(3), 515–526. https://doi.org/10.1007/s11390-012-1240-1
