A novel weighting scheme applied to improve the text document clustering techniques

Laith Mohammad Abualigah; Ahamad Tajudin Khader; Essam Said Hanandeh

Conference Proceedings

Studies in Computational Intelligence (2018) 741 305-320

DOI: 10.1007/978-3-319-66984-7_18

41 Citations

23 Readers

Abstract

Text clustering is an efficient analysis technique used in text mining to arrange a large number of unorganized text documents into a set of coherent clusters, so that similar documents fall in the same cluster. In this paper, we propose a novel term weighting scheme, length feature weight (LFW), that improves text document clustering algorithms based on new factors. The proposed scheme assigns a favorable term weight according to information obtained from the document collection. It recognizes the terms that are particular to each cluster and enhances their weights based on the proposed factors at the document level. The β-hill climbing technique is used to validate the proposed scheme in text clustering. The proposed weighting scheme is compared with the existing weighting scheme (TF-IDF) to validate its results in that domain. Experiments are conducted on eight standard benchmark text datasets taken from the Laboratory of Computational Intelligence (LABIC). The results show that the proposed LFW scheme outperforms the existing weighting scheme and improves the results of text document clustering in terms of F-measure, precision, and recall.
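For readers unfamiliar with the TF-IDF baseline named in the abstract, the following is a minimal sketch of one common variant, w(t, d) = tf(t, d) · log(N / df(t)). The abstract does not state which TF-IDF variant the paper uses, nor the LFW formula, so the function name `tf_idf`, the raw-count term frequency, the natural logarithm, and the toy documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = raw term count * ln(N / document frequency).
    This is an illustrative baseline, not the paper's exact formulation.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy collection of three tokenized documents.
docs = [
    ["text", "clustering", "text"],
    ["document", "clustering"],
    ["term", "weighting", "scheme"],
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in few documents, such as "text", receives a higher weight than one shared across documents, such as "clustering". A cluster-aware scheme like the one the abstract describes would adjust such weights further, using factors not specified here.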

Cite

Abualigah, L. M., Khader, A. T., & Hanandeh, E. S. (2018). A novel weighting scheme applied to improve the text document clustering techniques. In Studies in Computational Intelligence (Vol. 741, pp. 305–320). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-66984-7_18
