Multi-label feature ranking with ensemble methods

7Citations
Citations of this article
11Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, we propose three ensemble-based feature ranking scores for multi-label classification (MLC), which is a generalisation of multi-class classification where the classes are not mutually exclusive. Each of the scores (Symbolic, Genie3 and Random forest) can be computed from three different ensembles of predictive clustering trees: Bagging, Random forest and Extra trees. We extensively evaluate the proposed scores on 24 benchmark MLC problems, using 15 standard MLC evaluation measures. We determine the ranking quality saturation points in terms of the ensemble sizes, for each ranking-ensemble pair, and show that quality rankings can be computed really efficiently (typically 10 or 50 trees suffice). We also show that the proposed feature rankings are relevant and determine the most appropriate ensemble method for every feature ranking score. We empirically prove that the proposed feature ranking scores outperform current state-of-the-art methods in the quality of the rankings (for the majority of the evaluation measures), and in time efficiency. Finally, we determine the best performing feature ranking scores. Taking into account the quality of the rankings first and—in the case of ties—time efficiency, we identify the Genie3 feature ranking score as the optimal one.

Cite

CITATION STYLE

APA

Petković, M., Džeroski, S., & Kocev, D. (2020). Multi-label feature ranking with ensemble methods. Machine Learning, 109(11), 2141–2159. https://doi.org/10.1007/s10994-020-05908-1

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free