Text documents clustering using data mining techniques

Abstract

Rapid progress in numerous research fields and information technologies has led to an increase in the publication of research papers. As a result, researchers spend a lot of time finding relevant papers that are close to their field of specialization. In this paper, we propose a document classification approach that clusters the text documents of research papers into meaningful categories, each of which covers a similar scientific field. The approach is based on the essential focus and scope of the target categories, where each category includes many topics. Accordingly, we extract word tokens from the topics that relate to each specific category separately. The frequency of word tokens in a document determines the document's weight, which is calculated using the term frequency-inverse document frequency (TF-IDF) statistic. The proposed approach uses the title, abstract, and keywords of each paper, in addition to the category topics, to perform the classification. Documents are then classified and clustered into the primary categories based on the highest cosine similarity between the category weights and the document weights.
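
The pipeline described in the abstract (TF-IDF weighting followed by assignment to the category with the highest cosine similarity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names, and the sample category topics and documents are all hypothetical, since the abstract does not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens only; the paper's
    # preprocessing (stop words, stemming, etc.) is not given in the abstract.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (term -> weight) per input text."""
    token_lists = [tokenize(t) for t in texts]
    n = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    # Assumption: a smoothed IDF variant; the paper may use another form.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_categories(categories, documents):
    """Label each document with the category of highest cosine similarity."""
    names = list(categories)
    vectors = tfidf_vectors([categories[name] for name in names] + documents)
    cat_vecs, doc_vecs = vectors[:len(names)], vectors[len(names):]
    labels = []
    for doc_vec in doc_vecs:
        scores = [cosine(doc_vec, cat_vec) for cat_vec in cat_vecs]
        labels.append(names[scores.index(max(scores))])
    return labels


# Hypothetical category topics and paper texts (title + abstract + keywords).
CATEGORIES = {
    "networking": "wireless network routing protocol sensor",
    "machine_learning": "neural network classification clustering learning",
}
DOCUMENTS = [
    "Routing protocol for wireless sensor deployments",
    "Clustering and classification with deep learning",
]

labels = assign_categories(CATEGORIES, DOCUMENTS)
print(labels)
```

In this sketch each category is represented by a single string of topic words, and each paper by the concatenation of its title, abstract, and keywords, so both sides live in the same TF-IDF vector space before the similarity comparison.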

Cite

Jalal, A. A., & Ali, B. H. (2021). Text documents clustering using data mining techniques. International Journal of Electrical and Computer Engineering, 11(1), 664–670. https://doi.org/10.11591/ijece.v11i1.pp664-670
