Clustering is a machine intelligence technique that aims to group a set of objects into subsets, or clusters. Clustering text documents into categories is a vital step in the indexing, retrieval, management and mining of the abundant text data on the Web. To show that a new clustering algorithm is efficient, researchers need to compare it with existing algorithms, which requires standard datasets. In this paper we pre-process the datasets into a standardized format with a range of properties suitable for a wide variety of clustering and related experiments. Our objective is to set up benchmark document datasets, extract parts of speech such as verbs, nouns, adverbs and adjectives from the documents of a given dataset, and analyze the impact of parts of speech on the clustering process.
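The idea of restricting documents to selected parts of speech before clustering can be illustrated with a minimal sketch. This is not the authors' pipeline: `TOY_LEXICON` is a hypothetical hand-made word-to-tag table standing in for a real part-of-speech tagger, the three documents are invented, and the sketch stops at pairwise cosine similarity over term counts, which is the kind of input a clustering algorithm would then consume.

```python
import math
from collections import Counter

# Hypothetical hand-made lexicon standing in for a real POS tagger (assumption).
TOY_LEXICON = {
    "investors": "NOUN", "stocks": "NOUN", "players": "NOUN", "match": "NOUN",
    "buy": "VERB", "sell": "VERB", "win": "VERB",
    "quickly": "ADV", "strong": "ADJ",
}

def filter_by_pos(text, keep_tags):
    """Keep only the tokens whose (toy) tag is in keep_tags."""
    tokens = text.lower().split()
    return [t for t in tokens if TOY_LEXICON.get(t) in keep_tags]

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_matrix(docs, keep_tags):
    """Pairwise cosine similarity after filtering each document by POS."""
    vecs = [Counter(filter_by_pos(d, keep_tags)) for d in docs]
    return [[cosine(u, v) for v in vecs] for u in vecs]

# Invented example documents (assumption, not from the paper's datasets).
docs = [
    "investors buy stocks quickly",
    "investors sell stocks",
    "players win match",
]

nouns_only = similarity_matrix(docs, {"NOUN"})
verbs_only = similarity_matrix(docs, {"VERB"})

for name, matrix in (("nouns", nouns_only), ("verbs", verbs_only)):
    print(name)
    for row in matrix:
        print(["%.2f" % value for value in row])
```

In this toy data the first two documents share all their nouns but none of their verbs, so the choice of part of speech changes which documents look similar. That is the kind of effect the analysis described above sets out to measure on real datasets.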
CITATION STYLE
Sri Lalitha, Y., Sirisha Devi, J., Ledalla, S., & Ganapathi Raju, N. V. (2019). Analysis of parts of speech tagging on text clustering. International Journal of Innovative Technology and Exploring Engineering, 8(8), 2287–2291.