Clustering of concept-drift categorical data implementation in JAVA

K. Reddy Madhavi; A. Vinaya Babu; S. Viswanadha Raju

Conference Proceedings

Communications in Computer and Information Science (2012) 270 CCIS(PART II) 639-654

DOI: 10.1007/978-3-642-29216-3_70

Abstract

Identifying useful clusters in large datasets has attracted considerable interest. Clustering categorical data is harder than clustering numerical data, because the similarity measures in traditional clustering algorithms rely on distances between points, which are not appropriate for Boolean and categorical attributes. Data on the World Wide Web is growing exponentially, which affects clustering accuracy and decision making; as the data evolves, the concept underlying the clusters changes, a phenomenon called concept drift. This makes clustering concept-drift categorical data a challenging problem. To detect differences in cluster distributions between the current data subset and the previous clustering result, an algorithm called Drifting Concept Detection (DCD), which uses a sliding window and node importance, is presented and implemented in Java on the "usenet" dataset, in which every data point is a message and every node is a word. In this paper, a few concepts have been implemented to produce appropriate clustering results while minimizing the clustering work performed as time-evolving data enters the sliding window, which reduces I/O costs. The number of concept drifts decreases as the sliding window size increases. © 2012 Springer-Verlag.
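
The abstract does not give the DCD formulas, so the following Java sketch only illustrates the general idea of comparing a new sliding window against a previous clustering result. It is not the authors' implementation. As assumptions, it uses plain word frequency within a cluster as a stand-in for node importance, and it flags drift when the share of poorly matching messages in the window exceeds a threshold. The class name `DriftSketch`, the method names, both thresholds, and the sample messages are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; a simplified stand-in, not the DCD algorithm from the paper. */
final class DriftSketch {

    /** Fraction of messages in a cluster that contain each word (assumed proxy for node importance). */
    static Map<String, Double> nodeFrequencies(List<List<String>> cluster) {
        Map<String, Double> freq = new HashMap<>();
        for (List<String> message : cluster) {
            // Count each word at most once per message.
            for (String word : new HashSet<>(message)) {
                freq.merge(word, 1.0, Double::sum);
            }
        }
        freq.replaceAll((word, count) -> count / cluster.size());
        return freq;
    }

    /** Score of a message against one cluster: sum of the frequencies of its distinct words. */
    static double score(List<String> message, Map<String, Double> freq) {
        double total = 0.0;
        for (String word : new HashSet<>(message)) {
            total += freq.getOrDefault(word, 0.0);
        }
        return total;
    }

    /** Share of messages in the window whose best cluster score is below minScore. */
    static double outlierRatio(List<List<String>> window,
                               List<Map<String, Double>> clusters,
                               double minScore) {
        if (window.isEmpty()) {
            return 0.0;
        }
        int outliers = 0;
        for (List<String> message : window) {
            double best = 0.0;
            for (Map<String, Double> cluster : clusters) {
                best = Math.max(best, score(message, cluster));
            }
            if (best < minScore) {
                outliers++;
            }
        }
        return (double) outliers / window.size();
    }

    /** Flags drift when too many messages in the window fit none of the previous clusters. */
    static boolean driftDetected(List<List<String>> window,
                                 List<Map<String, Double>> clusters,
                                 double minScore,
                                 double maxOutlierRatio) {
        return outlierRatio(window, clusters, minScore) > maxOutlierRatio;
    }

    public static void main(String[] args) {
        // Previous clustering result: two hypothetical clusters of messages (lists of words).
        List<List<String>> space = Arrays.asList(
                Arrays.asList("space", "nasa"), Arrays.asList("space", "orbit"));
        List<List<String>> medicine = Arrays.asList(
                Arrays.asList("medicine", "doctor"), Arrays.asList("doctor", "patient"));
        List<Map<String, Double>> clusters = Arrays.asList(
                nodeFrequencies(space), nodeFrequencies(medicine));

        // Two successive sliding windows of new messages.
        List<List<String>> stableWindow = Arrays.asList(
                Arrays.asList("space", "orbit"), Arrays.asList("doctor", "patient"));
        List<List<String>> driftedWindow = Arrays.asList(
                Arrays.asList("hockey", "goal"), Arrays.asList("hockey", "team"),
                Arrays.asList("space", "nasa"));

        System.out.println("stable window drift:  " + driftDetected(stableWindow, clusters, 1.0, 0.5));
        System.out.println("drifted window drift: " + driftDetected(driftedWindow, clusters, 1.0, 0.5));
    }
}
```

In this sketch, a window that matches the previous clusters could simply be assigned to them without reclustering, which is one way to read the abstract's point about reducing clustering work; only a window flagged as drifted would need to be clustered again.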

Cite

APA

Reddy Madhavi, K., Vinaya Babu, A., & Viswanadha Raju, S. (2012). Clustering of concept-drift categorical data implementation in JAVA. In Communications in Computer and Information Science (Vol. 270 CCIS, pp. 639–654). https://doi.org/10.1007/978-3-642-29216-3_70
