A Hierarchical Document Clustering Approach with Frequent Itemsets

  • Lee C
  • Hsu C
  • et al.

Abstract

To retrieve required information effectively from the large volume of data collected from the Internet, document clustering has become a popular research topic in text mining. Clustering is the unsupervised classification of data items into groups, with no need for training data. Many conventional document clustering methods perform inefficiently on large document collections and require special handling for high dimensionality and high volume. We propose OCFI (Ontology and Closed Frequent Itemset-based Hierarchical Clustering), a hierarchical clustering method developed for document clustering. OCFI uses the words that documents share to cluster them and builds a hierarchical topic tree. In addition, OCFI uses an ontology to address semantic problems and capture the meaning behind the words in documents. Furthermore, it uses closed frequent itemsets rather than all frequent itemsets, which increases efficiency and scalability. The experimental results reveal that our method is more effective than well-known document clustering algorithms. The clustering results can be used in a personalized search service to help users obtain the information they need.
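
The abstract does not give OCFI's algorithmic details, so the following is only a minimal sketch of the general idea of clustering documents by closed frequent itemsets, not the authors' implementation. It assumes each document is already reduced to a set of terms, uses a brute-force miner suitable only for tiny inputs, and omits the ontology step and the topic tree entirely. The function names (`frequent_itemsets`, `closed_itemsets`, `assign_clusters`), the toy documents, and the assignment rule (largest matching closed itemset) are all illustrative assumptions.

```python
from itertools import combinations


def frequent_itemsets(docs, min_support):
    """Map each term set found in at least min_support documents to its support.

    Brute-force enumeration by itemset size; only practical for tiny vocabularies.
    """
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, len(vocab) + 1):
        found_any = False
        for combo in combinations(vocab, size):
            items = frozenset(combo)
            support = sum(1 for d in docs if items <= d)
            if support >= min_support:
                result[items] = support
                found_any = True
        if not found_any:
            # No frequent itemset of this size means none of a larger size.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support."""
    return {
        items: support
        for items, support in frequent.items()
        if not any(
            items < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


def assign_clusters(docs, closed):
    """Assign each document to the largest closed itemset it contains.

    Ties are broken by higher support, then by the sorted terms (an assumption
    made here for determinism). Documents matching no itemset are skipped.
    """
    clusters = {}
    for index, doc in enumerate(docs):
        candidates = [items for items in closed if items <= doc]
        if not candidates:
            continue
        label = max(candidates, key=lambda s: (len(s), closed[s], sorted(s)))
        clusters.setdefault(label, []).append(index)
    return clusters


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"apple", "fruit", "juice"},
    {"apple", "fruit", "pie"},
    {"apple", "fruit"},
    {"car", "engine"},
    {"car", "engine", "wheel"},
]

frequent = frequent_itemsets(docs, min_support=2)
closed = closed_itemsets(frequent)
clusters = assign_clusters(docs, closed)

for label, members in clusters.items():
    print(sorted(label), members)
```

On this toy corpus the six frequent itemsets collapse to two closed ones, which illustrates why the closed representation gives fewer candidate cluster labels to process without losing support information.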

Citation (APA)

Lee, C.-J., Hsu, C.-C., & Chen, D.-R. (2017). A Hierarchical Document Clustering Approach with Frequent Itemsets. International Journal of Engineering and Technology, 9(2), 174–178. https://doi.org/10.7763/ijet.2017.v9.965
