Clustering and labeling of multi-dimensional mixed structured data

Marco Brambilla; Massimiliano Zanoni

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7538, pp. 111-126 (2012)

DOI: 10.1007/978-3-642-34213-4_8

Abstract

Cluster analysis is the aggregation of the data items of a given set into subsets based on similarity properties. Clustering techniques have been applied in many fields that typically involve large amounts of complex data. This study focuses on what we call multi-domain clustering and labeling, i.e. a set of techniques for clustering multi-dimensional, structured, mixed data. The work studies the best mix of clustering techniques for addressing the problem in the multi-domain setting. The data types considered are numerical, categorical and textual, and all of them can appear together within the same clustering scenario. We focus on k-means and agglomerative hierarchical clustering methods based on a new distance function that we define for this specific setting. The proposed approach has been validated on real and realistic data sets from the college, automobile and leisure domains. The experimental results allowed us to evaluate the effectiveness of the different solutions, both for clustering and for labeling. © Springer-Verlag Berlin Heidelberg 2012.
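The abstract does not spell out the distance function, so the following is only a minimal sketch of the general idea: a per-attribute distance for numerical, categorical and textual fields, combined into one value and fed to an average-linkage agglomerative procedure. The range-scaled numeric distance, the mismatch distance, the Jaccard text distance, the unweighted mean, the field names (`price`, `fuel`, `description`) and the toy records are all assumptions made for illustration, not the authors' definitions or data.

```python
from itertools import combinations


def numeric_distance(a, b, value_range):
    """Absolute difference scaled to [0, 1] by the attribute's range."""
    return abs(a - b) / value_range if value_range else 0.0


def categorical_distance(a, b):
    """Simple mismatch distance: 0 if the categories match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def text_distance(a, b):
    """Jaccard distance between the lowercase word sets of two strings."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return 1.0 - len(words_a & words_b) / len(union) if union else 0.0


def mixed_distance(x, y, ranges):
    """Unweighted mean of the per-attribute distances (an assumption)."""
    parts = [
        numeric_distance(x["price"], y["price"], ranges["price"]),
        categorical_distance(x["fuel"], y["fuel"]),
        text_distance(x["description"], y["description"]),
    ]
    return sum(parts) / len(parts)


def agglomerative(items, k, ranges):
    """Average-linkage agglomerative clustering, merging down to k clusters."""
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        total = sum(mixed_distance(items[i], items[j], ranges)
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the closest pair of clusters (a < b) and merge b into a.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical automobile-style records, invented for this sketch.
cars = [
    {"price": 9000, "fuel": "petrol", "description": "small city car"},
    {"price": 9500, "fuel": "petrol", "description": "compact city car"},
    {"price": 42000, "fuel": "diesel", "description": "large family suv"},
    {"price": 45000, "fuel": "diesel", "description": "luxury family suv"},
]
ranges = {"price": max(c["price"] for c in cars) - min(c["price"] for c in cars)}

clusters = agglomerative(cars, 2, ranges)
print(clusters)
```

On these toy records the two cheap petrol city cars end up in one cluster and the two diesel SUVs in the other. In practice the per-attribute weights and the choice of text distance would have to be tuned to the data, which is the kind of question the study appears to address.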

Cite

Brambilla, M., & Zanoni, M. (2012). Clustering and labeling of multi-dimensional mixed structured data. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7538, 111–126. https://doi.org/10.1007/978-3-642-34213-4_8
