Categorical data has long posed a challenge for cluster analysis. With growing interest in big data analysis, the need for better clustering methods for categorical and mixed data has arisen. Prevailing clustering algorithms are unsuitable for categorical data, mainly because the distance functions used for continuous data are not applicable to categorical attributes. Recent research has explored several approaches to clustering categorical data; however, the complexity of these methods makes them unsuitable for big data, where faster algorithms are needed. This paper proposes a simple, fast method derived from statistics for clustering categorical data. Results on popular datasets are encouraging.
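The point about distance functions can be illustrated with a minimal sketch. It is not the method proposed in the paper, which the abstract does not describe. The attribute values, the integer coding, and the helper names `euclidean`, `simple_matching`, and `encode` are all assumptions chosen for illustration. The sketch shows that Euclidean distance on arbitrarily coded categories imposes an ordering that the data does not have, while a simple matching dissimilarity treats all mismatches alike.

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def simple_matching(a, b):
    """Fraction of attributes on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical integer codes for a single "colour" attribute (an assumption;
# any other assignment of codes would be equally valid).
codes = {"red": 0, "green": 1, "blue": 2}

def encode(record):
    """Map a categorical record to its arbitrary integer codes."""
    return [codes[value] for value in record]

red, green, blue = ("red",), ("green",), ("blue",)

# Euclidean distance depends on the arbitrary coding:
# "blue" appears twice as far from "red" as "green" does.
d_red_green = euclidean(encode(red), encode(green))
d_red_blue = euclidean(encode(red), encode(blue))

# Simple matching only asks whether values differ, so both pairs are
# equally dissimilar.
m_red_green = simple_matching(red, green)
m_red_blue = simple_matching(red, blue)

print(d_red_green, d_red_blue)   # 1.0 2.0
print(m_red_green, m_red_blue)   # 1.0 1.0
```

Recoding the colours in a different order would change the Euclidean distances but leave the matching dissimilarities untouched, which is why measures designed for continuous data do not carry over directly to categorical attributes.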
Khandelwal, G., & Sharma, R. (2015). A Simple Yet Fast Clustering Approach for Categorical Data. International Journal of Computer Applications, 120(17), 25–30. https://doi.org/10.5120/21321-4341