Clustering methods for moderate-to-high dimensionality data

Abstract

Traditional clustering methods are usually inefficient and ineffective on data with more than five or so dimensions. Sect. 2.3 of the previous chapter discusses the main reasons for this. It also notes that dimensionality reduction methods do not solve the problem, since they treat only the global correlations in the data. Correlations local to subsets of the data cannot be identified without first identifying the data clusters where they occur. Thus, algorithms that combine dimensionality reduction and clustering into a single task have been developed to look for clusters together with the subspaces of the original space where they exist. Some of these algorithms are briefly described in this chapter. Specifically, we first present a concise survey of the existing algorithms and then discuss three of the most relevant ones. To help evaluate and compare the algorithms, we conclude the chapter with a table that links some of the most relevant techniques to the main desirable properties that any clustering technique for moderate-to-high dimensionality data should have. The general goal is to identify the main strategies already used to deal with the problem, as well as the key limitations of the existing techniques.
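
The distinction between global and local correlations can be illustrated with a small sketch. It is not taken from the chapter: the synthetic data, the function names (`make_two_local_correlations`, `corr_xy`) and all parameter values are assumptions chosen for illustration. Two groups of 2-D points are generated, one lying along y = x and the other along y = -x. Within each group the two attributes are strongly correlated, yet over the whole dataset the correlation nearly vanishes, so a method that looks only at global correlations would likely miss both.

```python
import numpy as np


def make_two_local_correlations(n=500, noise=0.05, seed=0):
    """Build two hypothetical groups of 2-D points with opposite local correlations."""
    rng = np.random.default_rng(seed)
    t_a = rng.uniform(-1.0, 1.0, n)
    t_b = rng.uniform(-1.0, 1.0, n)
    # Group 0 lies along y = x, group 1 along y = -x, both with small Gaussian noise.
    group_a = np.column_stack([t_a, t_a]) + rng.normal(0.0, noise, (n, 2))
    group_b = np.column_stack([t_b, -t_b]) + rng.normal(0.0, noise, (n, 2))
    points = np.vstack([group_a, group_b])
    labels = np.repeat([0, 1], n)
    return points, labels


def corr_xy(points):
    """Pearson correlation between the first and second attribute."""
    return float(np.corrcoef(points[:, 0], points[:, 1])[0, 1])


points, labels = make_two_local_correlations()

# Correlation measured over all points, ignoring group membership.
global_corr = corr_xy(points)

# Correlation measured separately inside each group (labels assumed known here).
local_corrs = [corr_xy(points[labels == k]) for k in (0, 1)]

print(f"global correlation: {global_corr:+.3f}")
print(f"local correlations: {local_corrs[0]:+.3f}, {local_corrs[1]:+.3f}")
```

The sketch assumes the group labels are already known, which is exactly what a real clustering task lacks. That circularity is the motivation the abstract gives for algorithms that search for clusters and their subspaces together.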

Cite

Cordeiro, R. L. F., Faloutsos, C., & Traina Júnior, C. (2013). Clustering methods for moderate-to-high dimensionality data. In SpringerBriefs in Computer Science (Vol. 0, pp. 21–32). Springer. https://doi.org/10.1007/978-1-4471-4890-6_3
