A semi-supervised multi-label classification framework with feature reduction and enrichment

9Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Multi-label classification (MLC) has drawn much attention thanks to its usefulness and omnipresence in real-world applications in which objects may be characterized by more than one label as in the traditional approach. Getting multi-label examples is costly and time-consuming; therefore, semi-supervised learning approach should be considered to take advantages of both labelled and unlabelled data. In this work, we propose a semi-supervised MLC algorithm exploiting the specific features of the prominent class label(s) chosen by a greedy approach as an extension of the LIFT algorithm, and unlabelled data consumption mechanism from the TESC algorithm. We also make a semi-supervised MLC application framework for Vietnamese texts with several feature enrichment steps including (a) a stage of enriching features by adding hidden topic features and (b) a stage of dimensional reduction for subtracting irrelevant features. Experimental results on a dataset of hotel reviews (for tourism) indicate that a reasonable amount of unlabelled data helps to increase the F1 score. Interestingly, with a small amount of labelled data, our algorithm can reach a comparative performance to the case of using a larger amount of labelled data.

Cite

CITATION STYLE

APA

Pham, T. N., Nguyen, V. Q., Tran, V. H., Nguyen, T. T., & Ha, Q. T. (2017). A semi-supervised multi-label classification framework with feature reduction and enrichment. Journal of Information and Telecommunication, 1(4), 305–318. https://doi.org/10.1080/24751839.2017.1364925

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free