Most existing web page clustering algorithms are based on short, uneven snippets of web pages, which often leads to poor clustering performance. On the other hand, classical clustering algorithms for full-text web pages are too complex to produce good cluster labels and are incapable of on-line clustering. To address these problems, this article presents an on-line web page clustering algorithm based on maximal frequent item sets (MFIC). First, the maximal frequent item sets are mined, and the web pages are clustered based on the frequent item sets they share. Second, the clusters are labelled based on the frequent items. Experimental results show that MFIC can effectively reduce clustering time, improve clustering accuracy by 15%, and generate understandable labels. © 2011 Published by Elsevier Ltd.
Wei, Y. W. (2011). The clustering algorithm of query result based on maximal frequent. In Procedia Engineering (Vol. 15, pp. 1642–1646). https://doi.org/10.1016/j.proeng.2011.08.306
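
The two steps named in the abstract (mine maximal frequent item sets, then cluster and label pages by the item sets they share) can be illustrated with a minimal Python sketch. It is not the MFIC algorithm from the paper, whose details the abstract does not give. The brute-force level-wise mining, the rule of assigning each page to the largest maximal item set it contains, the "other" fallback cluster, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(docs, min_support):
    """Level-wise search for every term set found in at least min_support pages."""
    items = sorted({term for doc in docs for term in doc})
    frequent = []
    current = [frozenset([term]) for term in items]
    while current:
        # Keep only the candidates that occur in enough pages.
        survivors = [c for c in current if sum(c <= d for d in docs) >= min_support]
        frequent.extend(survivors)
        # Join step: merge pairs of survivors that differ by exactly one term.
        next_level = {a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == len(a) + 1}
        current = sorted(next_level, key=sorted)
    return frequent


def maximal_itemsets(frequent):
    """Keep the frequent item sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


def cluster(docs, min_support):
    """Group pages by the largest maximal frequent item set each one contains.

    The item set's terms double as the cluster label (an assumed labelling rule).
    Pages that contain no maximal frequent item set go to an "other" cluster.
    """
    docs = [frozenset(d) for d in docs]
    maximal = maximal_itemsets(frequent_itemsets(docs, min_support))
    clusters = defaultdict(list)
    for idx, doc in enumerate(docs):
        covering = [m for m in maximal if m <= doc]
        if covering:
            # Assumed tie-break: prefer the longer item set, then alphabetical order.
            best = max(covering, key=lambda m: (len(m), sorted(m)))
            label = " ".join(sorted(best))
        else:
            label = "other"
        clusters[label].append(idx)
    return dict(clusters)


# Hypothetical query results for "java", each page reduced to a set of terms.
pages = [
    {"java", "coffee", "bean"},
    {"java", "coffee", "roast"},
    {"java", "island", "indonesia"},
    {"java", "island", "travel"},
    {"python", "snake"},
]
result = cluster(pages, min_support=2)
print(result)
```

With a minimum support of two pages, the maximal frequent item sets in this toy input are {coffee, java} and {island, java}, so the single frequent term "java" is absorbed into the two more specific sets and does not form a cluster of its own. The exhaustive search is exponential in the worst case and is only suitable for small examples.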