EM-based clustering algorithm for uncertain data

Naohiko Kinoshita; Yasunori Endo

Conference Proceedings

Advances in Intelligent Systems and Computing, vol. 245, pp. 69-81 (2014)

DOI: 10.1007/978-3-319-02821-7_8

Abstract

In recent years, there has been growing demand for advanced data analysis techniques that exploit modern computing power to extract valuable knowledge from data. Clustering is one of the unsupervised classification techniques used in data analysis. In clustering, information in a real space is transformed into data in a pattern space and then analyzed. However, data often should be represented not by a point but by a set because of uncertainty, e.g., measurement error margins, data that cannot be regarded as a single point, and missing values. Such uncertainty has been represented as interval ranges, and many clustering algorithms for interval data have been constructed. However, no guideline for selecting a suitable distance in each case has been given, which makes this selection difficult. Therefore, methods that calculate the dissimilarity between such uncertain data without introducing a particular distance, e.g., the nearest-neighbor distance, have been strongly desired. From this viewpoint, we proposed the concept of tolerance, which represents an uncertain datum not as an interval but as a point with a tolerance vector. However, the uncertainty distribution that the tolerance represents is uniform, and it is difficult to handle other distributions of uncertainty, e.g., the Gaussian distribution, in the framework of tolerance with hard c-means (HCM) or fuzzy c-means (FCM). In this paper, we construct a clustering algorithm based on the EM algorithm that handles uncertain data represented by Gaussian distributions by solving an optimization problem. Moreover, the effectiveness of the proposed algorithm is verified.

Cite

Kinoshita, N., & Endo, Y. (2014). EM-based clustering algorithm for uncertain data. In Advances in Intelligent Systems and Computing (Vol. 245, pp. 69–81). Springer Verlag. https://doi.org/10.1007/978-3-319-02821-7_8
