Knowledge-based representation for transductive multilingual document classification

Salvatore Romeo; Dino Ienco; Andrea Tagarelli

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9022 92-103

DOI: 10.1007/978-3-319-16354-3_11

8 Citations

12 Readers

Abstract

Multilingual document classification is often addressed by approaches that rely on language-specific resources (e.g., bilingual dictionaries and machine translation tools) to evaluate cross-lingual document similarities. However, the required transformations may alter the original document semantics, adding further issues to the known difficulty of obtaining high-quality labeled datasets. To overcome such issues, we propose a new framework for multilingual document classification under a transductive learning setting. We exploit a large-scale multilingual knowledge base, BabelNet, to model documents written in different languages in a common conceptual space, without requiring any language translation process. We resort to a state-of-the-art transductive learner to produce the document classification. Results on two real-world multilingual corpora highlight the effectiveness of the proposed document model with respect to document representations usually involved in multilingual and cross-lingual analysis, and the robustness of the transductive setting for multilingual document classification.
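
The two-stage idea in the abstract (map documents from different languages into shared concepts, then classify transductively) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `TOY_LEXICON` is a hypothetical hand-made stand-in for a multilingual knowledge base and does not use BabelNet or its API, the documents and labels are invented, and the similarity-weighted label propagation is a simple substitute for the transductive learner used in the paper, which the abstract does not name.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for a multilingual knowledge base:
# each (language, word) pair maps to a language-independent concept id.
TOY_LEXICON = {
    ("en", "dog"): "c:dog", ("it", "cane"): "c:dog", ("fr", "chien"): "c:dog",
    ("en", "cat"): "c:cat", ("it", "gatto"): "c:cat",
    ("en", "bank"): "c:bank", ("it", "banca"): "c:bank", ("fr", "banque"): "c:bank",
    ("en", "money"): "c:money", ("it", "denaro"): "c:money", ("fr", "argent"): "c:money",
}


def bag_of_concepts(lang, text):
    """Map a document to concept counts; words missing from the lexicon are skipped."""
    tokens = text.lower().split()
    return Counter(TOY_LEXICON[(lang, t)] for t in tokens if (lang, t) in TOY_LEXICON)


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_labels(vectors, seeds, iterations=10):
    """Similarity-weighted label propagation over labeled and unlabeled documents.

    `seeds` maps a document index to its known label. Seed labels stay fixed;
    every other document repeatedly takes the weighted vote of all the others.
    This is an illustrative stand-in, not the learner used in the paper.
    """
    labels = sorted(set(seeds.values()))
    n = len(vectors)
    scores = [{l: 1.0 if seeds.get(i) == l else 0.0 for l in labels} for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seeds:
                updated.append(scores[i])  # keep the known label clamped
                continue
            acc = {l: 0.0 for l in labels}
            for j in range(n):
                if j == i:
                    continue
                weight = cosine(vectors[i], vectors[j])
                for l in labels:
                    acc[l] += weight * scores[j][l]
            total = sum(acc.values())
            updated.append({l: (acc[l] / total if total else 0.0) for l in labels})
        scores = updated
    return [max(labels, key=lambda l: s[l]) for s in scores]


# Invented example corpus: two labeled English documents, three unlabeled ones.
docs = [
    ("en", "dog cat"),
    ("en", "bank money"),
    ("it", "cane gatto"),
    ("fr", "banque argent"),
    ("it", "banca denaro"),
]
seeds = {0: "animals", 1: "finance"}

vectors = [bag_of_concepts(lang, text) for lang, text in docs]
predicted = propagate_labels(vectors, seeds)
print(predicted)
```

Because the Italian and French documents share concept ids with the labeled English ones, they receive labels without any translation step, which is the property the abstract emphasizes. A real system would also need word sense disambiguation and a far larger concept inventory, neither of which this sketch attempts.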

Cite

CITATION STYLE

APA

Romeo, S., Ienco, D., & Tagarelli, A. (2015). Knowledge-based representation for transductive multilingual document classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9022, pp. 92–103). Springer Verlag. https://doi.org/10.1007/978-3-319-16354-3_11
