The k-means algorithm is undoubtedly the most widely used data clustering algorithm due to its relative simplicity. It can only handle data that are linearly separable. A generalization of k-means is kernel k-means, which can handle data that are not linearly separable. Standard k-means and kernel k-means have the same disadvantage of being sensitive to the initial placement of the cluster centers. A novel kernel k-means algorithm is proposed in the paper. The proposed algorithm uses a graph based kernel matrix and finds k data points as initial centers for kernel k-means. Since finding the optimal data points as initial centers is an NP-hard problem, this problem is relaxed to obtain k representative data points as initial centers. Matching pursuit algorithm for multiple vectors is used to greedily find k representative data points. The proposed algorithm is tested on synthetic and real-world datasets and compared with kernel k-means algorithms using other initialization techniques. Our empirical study shows encouraging results of the proposed algorithm.
CITATION STYLE
Yang, W., & Tang, L. (2015). Graph based kernel k-means using representative data points as initial centers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9225, pp. 293–304). Springer Verlag. https://doi.org/10.1007/978-3-319-22180-9_29
Mendeley helps you to discover research relevant for your work.