A New Wasserstein Based Distance for the Hierarchical Clustering of Histogram Symbolic Data

  • Irpino A
  • Verde R

Abstract

Symbolic Data Analysis (SDA) aims to describe and analyze complex and structured data extracted, for example, from large databases. Such data, which can be expressed as concepts, are modeled by symbolic objects described by multivalued variables. In this paper we present a new distance, based on the Wasserstein metric, for clustering a set of data described by distributions with finite continuous support, called "histograms" in SDA. The proposed distance permits us to define a measure of inertia of data with respect to a barycenter that satisfies the Huygens theorem of decomposition of inertia. We propose to use this measure for an agglomerative hierarchical clustering of histogram data based on the Ward criterion. An application to real data validates the procedure.
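
To make the kind of distance described in the abstract concrete, the following is a minimal sketch of a squared L2 Wasserstein distance between two histograms, computed as the integral over t in [0, 1] of the squared difference of their quantile functions. It is an illustration under stated assumptions, not necessarily the authors' exact formulation: each histogram is assumed to be a pair (bin edges, bin weights) with uniform density inside each bin and strictly positive weights, and the names `_cumulative`, `quantile`, and `wasserstein2_sq` are hypothetical helpers introduced here.

```python
from bisect import bisect_right


def _cumulative(weights):
    """Cumulative bin weights, normalized so the last value is exactly 1."""
    total = sum(weights)
    cum = [0.0]
    for w in weights:
        cum.append(cum[-1] + w / total)
    cum[-1] = 1.0  # guard against floating-point drift
    return cum


def quantile(edges, cum, t):
    """Quantile function of a histogram, assuming uniform density within bins.

    Assumes every bin has strictly positive weight, so the function is
    continuous and piecewise linear in t.
    """
    # Index of the bin whose cumulative-weight interval contains t.
    i = min(bisect_right(cum, t) - 1, len(cum) - 2)
    frac = (t - cum[i]) / (cum[i + 1] - cum[i])
    return edges[i] + frac * (edges[i + 1] - edges[i])


def wasserstein2_sq(h1, h2):
    """Squared L2 Wasserstein distance between two histograms.

    Each histogram is a pair (edges, weights). Between consecutive
    breakpoints of the merged cumulative weights both quantile functions
    are linear, so the squared difference is integrated exactly.
    """
    (e1, w1), (e2, w2) = h1, h2
    c1, c2 = _cumulative(w1), _cumulative(w2)
    ts = sorted(set(c1) | set(c2))
    total = 0.0
    for a, b in zip(ts, ts[1:]):
        ga = quantile(e1, c1, a) - quantile(e2, c2, a)
        gb = quantile(e1, c1, b) - quantile(e2, c2, b)
        # Exact integral of the square of a linear function on [a, b].
        total += (b - a) * (ga * ga + ga * gb + gb * gb) / 3.0
    return total
```

For example, two single-bin histograms on [0, 1] and [2, 3] differ only by a shift of 2, so `wasserstein2_sq(([0, 1], [1]), ([2, 3], [1]))` gives the squared shift, 4. A matrix of such pairwise distances is the kind of input an agglomerative procedure with the Ward criterion would work from.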

Cite

Irpino, A., & Verde, R. (2006). A New Wasserstein Based Distance for the Hierarchical Clustering of Histogram Symbolic Data. In Data Science and Classification (pp. 185–192). Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-34416-0_20
