Abstract
Hierarchical multi-label classification (HMLC) is a variant of classification in which instances may belong to multiple classes that are organized in a hierarchy. Our approach is based on decision trees and is set in the framework of predictive clustering trees (PCTs), which is implemented in the CLUS system. In this work, we investigate how different distance measures for hierarchies influence the predictive performance of PCTs. The distance measures we consider are the weighted Euclidean distance, the Jaccard distance, SimGIC, and the ImageCLEF distance. We evaluate the performance of PCTs with each distance on datasets from functional genomics. The datasets describe different functions of the genes in the genomes of two well-studied organisms: S. cerevisiae and A. thaliana. We use precision-recall curves to evaluate predictive performance. The results of the Friedman test suggest that the differences in performance among the distance measures are not statistically significant.
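As an illustration of what a distance between hierarchical label sets can look like, the following is a minimal Python sketch of two of the measures named above, the weighted Euclidean distance and the Jaccard distance. It is not the CLUS implementation. The toy hierarchy `PARENT`, the helper names, and the weighting scheme (class weight `w0 ** depth`, with `w0 = 0.75`) are assumptions made for this example; the abstract does not specify them. SimGIC and the ImageCLEF distance are omitted.

```python
from math import sqrt

# Hypothetical toy hierarchy: child -> parent (None marks a top-level class).
PARENT = {"1": None, "2": None, "1.1": "1", "1.2": "1", "1.1.1": "1.1"}


def depth(label):
    """Depth of a class in the hierarchy; top-level classes have depth 1."""
    d = 1
    while PARENT[label] is not None:
        label = PARENT[label]
        d += 1
    return d


def close_upward(labels):
    """Add every ancestor of each label so the set respects the hierarchy."""
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = PARENT[label]
    return closed


def weighted_euclidean(a, b, w0=0.75):
    """Euclidean distance over binary class vectors.

    Class c is weighted by w0 ** depth(c) (an assumed scheme), so
    disagreements deeper in the hierarchy count less.
    """
    a, b = close_upward(a), close_upward(b)
    # Binary vectors differ exactly on the symmetric difference of the sets.
    return sqrt(sum(w0 ** depth(c) for c in a ^ b))


def jaccard_distance(a, b):
    """One minus the ratio of shared classes to all classes in either set."""
    a, b = close_upward(a), close_upward(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)
```

For example, the label sets `{"1.1.1"}` and `{"1.2"}` share only the ancestor `"1"` out of four classes in their union, so `jaccard_distance` returns 0.75 for them.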
Cite
Aleksovski, D., Kocev, D., & Dzeroski, S. (2009). Learning From Multi-label Data. ECML PKDD, 5–15. Retrieved from http://www.ecmlpkdd2009.net/wp-content/uploads/2008/09/learning-from-multi-label-data.pdf#page=6