Text representation in multi-label classification: Two new input representations

Rodrigo Alfaro; Héctor Allende

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011), vol. 6594 LNCS (Part 2), pp. 61-70

DOI: 10.1007/978-3-642-20267-4_7

Abstract

Automatic text classification is the task of assigning unseen documents to a predefined set of classes. Text representation for classification purposes has been traditionally approached using a vector space model due to its simplicity and good performance. On the other hand, multi-label automatic text classification has been typically addressed either by transforming the problem under study to apply binary techniques or by adapting binary algorithms to work with multiple labels. In this paper we present two new representations for text documents based on label-dependent term-weighting for multi-label classification. We focus on modifying the input. Performance was tested with a well-known dataset and compared to alternative techniques. Experimental results based on Hamming loss analysis show an improvement against alternative approaches. © 2011 Springer-Verlag.
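The abstract reports results in terms of Hamming loss. As a reminder of what that metric measures, here is a minimal sketch, assuming labels are encoded as equal-length binary indicator rows (one row per document, one column per label). The function name `hamming_loss` and the toy data are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of (document, label) pairs that are predicted incorrectly.

    Assumes each row is a binary indicator vector over the same label set.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")

    total = 0       # number of (document, label) pairs seen
    mismatches = 0  # pairs where the prediction differs from the truth
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row pair must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            mismatches += int(t != p)

    if total == 0:
        raise ValueError("cannot compute Hamming loss on empty input")
    return mismatches / total


# Toy example (hypothetical data): 2 documents, 3 labels.
y_true = [[1, 0, 1],
          [0, 1, 0]]
y_pred = [[1, 1, 1],
          [0, 1, 1]]
loss = hamming_loss(y_true, y_pred)  # 2 wrong pairs out of 6
```

Lower values are better: a loss of 0 means every label of every document was predicted correctly.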

Cite

Alfaro, R., & Allende, H. (2011). Text representation in multi-label classification: Two new input representations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6594 LNCS, pp. 61–70). https://doi.org/10.1007/978-3-642-20267-4_7
