Abstract
The K-nearest neighbors (kNN) is a lazy-learning method for classification and regression that has been successfully applied to several application domains. It is simple and directly applicable to multi-class problems however it suffers a high complexity in terms of both memory and computations. Several research studies try to scale the kNN method to very large datasets using crisp partitioning. In this paper, we propose to integrate the principles of rough sets and fuzzy sets while conducting a clustering algorithm to separate the whole dataset into several parts, each of which is then conducted kNN classification. The concept of crisp lower bound and fuzzy boundary of a cluster which is applied to the proposed algorithm allows accurate selection of the set of data points to be involved in classifying an unseen data point. The data points to be used are a mix of core and border data points of the clusters created in the training phase. The experimental results on standard datasets show that the proposed kNN classification is more effective than related recent work with a slight increase in classification time.
Cite
CITATION STYLE
Mahfouz, M. (2018). RFKNN: ROUGH-FUZZY KNN FOR BIG DATA CLASSIFICATION. International Journal of Advanced Research in Computer Science, 9(2), 274–279. https://doi.org/10.26483/ijarcs.v9i2.5667
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.