Development of document clustering technique for Gurmukhi script using fuzzy term weight

Mukesh Kumar; Amandeep Verma

Journal Article, Open Access

International Journal of Recent Technology and Engineering (2019) 8(2) 1646-1653

DOI: 10.35940/ijrte.B2386.078219

Abstract

Document clustering is an unsupervised machine learning technique that groups similar objects into classes without prior knowledge of the data set. These classes are known as clusters. Each cluster consists of unlabeled data objects, arranged so that objects within the same cluster are maximally similar to one another and dissimilar to the objects in other clusters. The purpose of this research is to develop a domain-independent clustering technique for Gurmukhi script. This is the first such effort, as no prior work has developed a domain-independent clustering technique for Gurmukhi script. In this paper, a hybrid algorithm for document clustering of Gurmukhi script is developed. The experimental results show that the proposed hybrid technique performs better at determining the number of clusters, creating meaningful cluster titles, and assigning real-time unlabeled data sets to the relevant cluster. These gains result from pre-processing steps such as segmentation, stemming, and normalization, together with extraction of named/noun entities, creation of cluster titles, and placement of text documents into relevant clusters using fuzzy term weight.
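The abstract does not specify how the fuzzy term weight is computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that documents are already segmented, stemmed, and normalized into tokens, that each cluster title is a single term, and that a term's fuzzy weight is its count divided by the highest count in the document. The function names (`fuzzy_term_weights`, `cluster_memberships`, `assign_cluster`) and the sample tokens are hypothetical.

```python
from collections import Counter


def fuzzy_term_weights(tokens):
    """Assumed weighting: each term's count divided by the highest count,
    giving a membership value in (0, 1] per term."""
    counts = Counter(tokens)
    if not counts:
        return {}
    peak = max(counts.values())
    return {term: count / peak for term, count in counts.items()}


def cluster_memberships(tokens, cluster_titles):
    """Degree of membership of one document in each cluster.
    Memberships are normalized to sum to 1 when any title term occurs."""
    weights = fuzzy_term_weights(tokens)
    raw = {title: weights.get(title, 0.0) for title in cluster_titles}
    total = sum(raw.values())
    if total == 0:
        return {title: 0.0 for title in cluster_titles}
    return {title: score / total for title, score in raw.items()}


def assign_cluster(tokens, cluster_titles):
    """Pick the cluster with the highest membership, or None if the
    document contains none of the title terms."""
    memberships = cluster_memberships(tokens, cluster_titles)
    if not memberships:
        return None
    best = max(memberships, key=memberships.get)
    return best if memberships[best] > 0 else None


# Hypothetical cluster titles ("cricket", "election") and a pre-tokenized document.
titles = ["ਕ੍ਰਿਕਟ", "ਚੋਣ"]
doc = "ਕ੍ਰਿਕਟ ਟੀਮ ਮੈਚ ਕ੍ਰਿਕਟ ਚੋਣ".split()

memberships = cluster_memberships(doc, titles)
assigned = assign_cluster(doc, titles)
```

Because memberships are graded rather than binary, a document that mentions several title terms keeps a partial degree of membership in each cluster; the final assignment simply takes the largest one.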

Cite (APA)

Kumar, M., & Verma, A. (2019). Development of document clustering technique for Gurmukhi script using fuzzy term weight. International Journal of Recent Technology and Engineering, 8(2), 1646–1653. https://doi.org/10.35940/ijrte.B2386.078219
