A Huffman tree-based algorithm for clustering documents

Yaqiong Liu; Yuzhuo Wen; Dingrong Yuan; Yuwei Cuan

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8933 630-640

DOI: 10.1007/978-3-319-14717-8_49

Abstract

Text information processing is an important topic in data mining, drawing on techniques from statistics, machine learning, and pattern recognition. In the age of big data, a huge amount of text data has accumulated. At present, the most effective way to process text is to classify it before mining. This approach has therefore attracted great interest from scholars and researchers, and many constructive results have been achieved. However, as the number of training samples grows, the shortcomings of existing techniques and the limits of their applicability have gradually become apparent. In this paper, we propose a new strategy for classifying documents based on a Huffman tree. First, we find all candidate classifications by generating a Huffman tree; we then design a quality measure to select the final classification. Our experimental results show that the proposed algorithm is effective and feasible.
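
The two-step idea in the abstract (generate candidate groupings from a tree, then pick one with a quality measure) can be illustrated with a minimal Python sketch. The abstract does not say how the tree is built or how quality is defined, so everything here is an assumption rather than the authors' method: documents are term-frequency vectors, the tree is grown Huffman-style by repeatedly merging two nodes (here the two most similar by cosine similarity, where Huffman coding would merge the two lowest-weight nodes), and the hypothetical `quality` function scores a grouping as mean within-group similarity minus mean between-group similarity.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(text):
    """Term-frequency vector of a document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def candidate_partitions(vectors):
    """Merge nodes bottom-up and record the grouping after every merge.

    Assumption: the two most similar nodes are merged at each step,
    mirroring how a Huffman tree merges the two lowest-weight nodes.
    """
    # Each node is (member document ids, summed term vector).
    nodes = [([i], Counter(v)) for i, v in enumerate(vectors)]
    partitions = [[list(m) for m, _ in nodes]]
    while len(nodes) > 1:
        i, j = max(
            combinations(range(len(nodes)), 2),
            key=lambda p: cosine(nodes[p[0]][1], nodes[p[1]][1]),
        )
        merged = (nodes[i][0] + nodes[j][0], nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
        partitions.append([list(m) for m, _ in nodes])
    return partitions


def quality(partition, vectors):
    """Hypothetical measure: mean within-group minus mean between-group similarity."""
    label = {d: c for c, members in enumerate(partition) for d in members}
    within, between = [], []
    for a, b in combinations(range(len(vectors)), 2):
        s = cosine(vectors[a], vectors[b])
        (within if label[a] == label[b] else between).append(s)
    if not within or not between:
        return float("-inf")  # all-singleton or single-group partitions are not scored
    return sum(within) / len(within) - sum(between) / len(between)


def cluster_documents(docs):
    """Return the candidate grouping (lists of document indices) with the best score."""
    vectors = [tf_vector(d) for d in docs]
    best = max(candidate_partitions(vectors), key=lambda p: quality(p, vectors))
    return [sorted(group) for group in best]
```

Calling `cluster_documents` on a list of strings returns groups of document indices. The pairwise search makes this sketch cubic in the number of documents, so it is suited only to small illustrative inputs.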

Citation (APA)

Liu, Y., Wen, Y., Yuan, D., & Cuan, Y. (2014). A Huffman tree-based algorithm for clustering documents. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8933, 630–640. https://doi.org/10.1007/978-3-319-14717-8_49
