Knowledge transfer across multilingual corpora via latent topics

Abstract

This paper explores bridging the content of two different languages via latent topics. Specifically, we propose a unified probabilistic model to simultaneously model latent topics from bilingual corpora that discuss comparable content and use the topics as features in a cross-lingual, dictionary-less text categorization task. Experimental results on multilingual Wikipedia data show that the proposed topic model effectively discovers the topic information from the bilingual corpora, and the learned topics successfully transfer classification knowledge to other languages, for which no labeled training data are available. © 2011 Springer-Verlag.

Cite

De Smet, W., Tang, J., & Moens, M. F. (2011). Knowledge transfer across multilingual corpora via latent topics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6634 LNAI, pp. 549–560). Springer Verlag. https://doi.org/10.1007/978-3-642-20841-6_45
