A Hierarchical Document Clustering Approach with Frequent Itemsets

Cheng-Jhe Lee; Chiun-Chieh Hsu; Da-Ren Chen

Journal Article (Open Access)

International Journal of Engineering and Technology (2017) 9(2) 174-178

DOI: 10.7763/ijet.2017.v9.965

Abstract

To retrieve required information effectively from the large volume of data collected from the Internet, document clustering has become a popular research topic in text mining. Clustering is the unsupervised classification of data items into groups, with no need for training data. Many conventional document clustering methods perform inefficiently on large document collections and require special handling for high dimensionality and high volume. We propose OCFI (Ontology and Closed Frequent Itemset-based Hierarchical Clustering), a hierarchical clustering method developed for document clustering. OCFI uses common words to cluster documents and builds a hierarchical topic tree. In addition, OCFI uses an ontology to address the semantic problem and mine the meaning behind the words in documents. Furthermore, we use closed frequent itemsets rather than all frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more effective than well-known document clustering algorithms. The clustering results can be used in a personalized search service to help users obtain the information they need.
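
To illustrate why closed frequent itemsets are a more compact basis than all frequent itemsets, the following is a minimal sketch, not the OCFI algorithm itself. It assumes each document is reduced to a set of terms; the function names `frequent_itemsets` and `closed_itemsets`, the toy documents, and the brute-force enumeration are all illustrative assumptions. An itemset is closed when no proper superset occurs in the same number of documents.

```python
from itertools import combinations


def frequent_itemsets(docs, min_support):
    """Map each term set found in at least min_support documents to its count.

    Brute-force enumeration, suitable only for tiny illustrative inputs.
    """
    doc_sets = [frozenset(d) for d in docs]
    vocabulary = sorted(set().union(*doc_sets))
    result = {}
    for size in range(1, len(vocabulary) + 1):
        found_any = False
        for candidate in combinations(vocabulary, size):
            items = frozenset(candidate)
            support = sum(1 for d in doc_sets if items <= d)
            if support >= min_support:
                result[items] = support
                found_any = True
        if not found_any:
            # No frequent itemset of this size means none of a larger size.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        items: support
        for items, support in frequent.items()
        if not any(
            items < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"cluster", "document", "ontology"},
    {"cluster", "document", "itemset"},
    {"cluster", "document", "itemset", "tree"},
    {"ontology", "tree"},
]

frequent = frequent_itemsets(docs, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), "frequent itemsets,", len(closed), "closed")
```

In this toy corpus the closed itemsets are fewer than the frequent ones, yet the support of every frequent itemset can still be recovered from its smallest closed superset, which is the property that makes the closed representation attractive for scalability.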

Citation (APA)

Lee, C.-J., Hsu, C.-C., & Chen, D.-R. (2017). A Hierarchical Document Clustering Approach with Frequent Itemsets. International Journal of Engineering and Technology, 9(2), 174–178. https://doi.org/10.7763/ijet.2017.v9.965
