A refinement framework for cross language text categorization

Ke Wu; Bao Liang Lu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 4993 LNCS 401-411

DOI: 10.1007/978-3-540-68636-1_39

Abstract

Cross language text categorization is the task of exploiting labelled documents in a source language (e.g. English) to classify documents in a target language (e.g. Chinese). In this paper, we investigate the use of a bilingual lexicon for cross language text categorization and propose a novel refinement framework for the task. The framework consists of two stages. In the first stage, a cross language model transfer generates initial labels for the documents in the target language. In the second stage, an expectation maximization algorithm based on the naive Bayes model is applied to produce the final labels of those documents. Preliminary experimental results on collected corpora show that the proposed framework is effective. © 2008 Springer-Verlag Berlin Heidelberg.
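The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the model transfer works, so stage 1 here assumes a simple word-by-word mapping of source tokens through the bilingual lexicon followed by multinomial naive Bayes training in the target vocabulary. The function names (`train_nb`, `posterior`, `transfer_model`, `refine`), the smoothing constant, the iteration count, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def train_nb(docs, weights, classes, vocab, alpha=1.0):
    """Fit multinomial naive Bayes from per-document class weights (hard or soft)."""
    prior = {c: alpha for c in classes}
    counts = {c: defaultdict(float) for c in classes}
    for doc, w in zip(docs, weights):
        for c in classes:
            wc = w.get(c, 0.0)
            prior[c] += wc
            for tok in doc:
                if tok in vocab:
                    counts[c][tok] += wc
    total = sum(prior.values())
    model = {}
    for c in classes:
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_lik = {t: math.log((counts[c][t] + alpha) / denom) for t in vocab}
        model[c] = (math.log(prior[c] / total), log_lik)
    return model


def posterior(model, doc):
    """Return P(class | doc) under the model, ignoring out-of-vocabulary tokens."""
    scores = {c: lp + sum(ll[t] for t in doc if t in ll)
              for c, (lp, ll) in model.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def transfer_model(src_docs, src_labels, lexicon, classes, tgt_vocab):
    """Stage 1 (assumed form): map source tokens through the lexicon, then train."""
    translated = [[t for s in doc for t in lexicon.get(s, [])] for doc in src_docs]
    hard = [{label: 1.0} for label in src_labels]
    return train_nb(translated, hard, classes, tgt_vocab)


def refine(tgt_docs, model, classes, tgt_vocab, iters=10):
    """Stage 2: EM with naive Bayes on the unlabelled target documents."""
    post = [posterior(model, d) for d in tgt_docs]            # initial labels
    for _ in range(iters):
        model = train_nb(tgt_docs, post, classes, tgt_vocab)  # M-step
        post = [posterior(model, d) for d in tgt_docs]        # E-step
    return [max(p, key=p.get) for p in post]


# Toy data, invented for illustration only.
classes = ["sports", "finance"]
src_docs = [["sport", "ball"], ["game", "ball"], ["bank", "money"], ["money", "market"]]
src_labels = ["sports", "sports", "finance", "finance"]
lexicon = {"ball": ["qiu"], "game": ["bisai"], "money": ["qian"], "bank": ["yinhang"]}
tgt_docs = [["qiu", "bisai", "guanjun"], ["qian", "yinhang", "gupiao"],
            ["guanjun", "qiu"], ["gupiao", "qian"]]
tgt_vocab = {tok for doc in tgt_docs for tok in doc}

initial = transfer_model(src_docs, src_labels, lexicon, classes, tgt_vocab)
labels = refine(tgt_docs, initial, classes, tgt_vocab)
print(labels)
```

In this sketch the refinement step lets target-language words that the lexicon does not cover (here `guanjun` and `gupiao`) acquire class evidence from the documents they co-occur in, which is the intuition behind re-estimating the model on the target corpus.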

Cite

Wu, K., & Lu, B. L. (2008). A refinement framework for cross language text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4993 LNCS, pp. 401–411). https://doi.org/10.1007/978-3-540-68636-1_39
