Similarity measures for binary variables are used in many problems of machine learning, pattern recognition and classification. Currently, the dozens of similarity measures are introduced and the problem of comparative analysis of these measures appears. One of the methods used for such analysis is clustering of similarity measures based on correlation between data similarity values obtained by different measures. The paper proposes the method of comparative analysis of similarity measures based on the set theoretic representation of these measures and comparison of algebraic properties of these representations. The results show existing relationship between results of clustering and the classification of measures by their properties. Due to the results of clustering depend on the clustering method and on data used for measuring correlation between measures we conclude that the classification based on the proposed properties of similarity measures is more suitable for comparative analysis of similarity measures.
CITATION STYLE
Mejia, I. R., & Batyrshin, I. (2018). Towards a classification of binary similarity measures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10632 LNAI, pp. 325–335). Springer Verlag. https://doi.org/10.1007/978-3-030-02837-4_27
Mendeley helps you to discover research relevant for your work.