Using nearest neighbor information to improve cross-language text classification

Abstract

Cross-language text classification (CLTC) aims to take advantage of existing training data in one language to construct a classifier for another language. In addition to the expected translation issues, CLTC is complicated by the cultural distance between the two languages, which causes documents belonging to the same category to concern very different topics. This paper proposes a re-classification method whose purpose is to reduce the errors caused by this phenomenon by considering information from the target-language documents themselves. Experimental results on a news corpus, considering three pairs of languages and four categories, demonstrated the appropriateness of the proposed method, which could improve the initial classification accuracy by up to 11%. © 2009 Springer-Verlag Berlin Heidelberg.
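
The abstract does not spell out the re-classification procedure, so the following is only a minimal sketch of the general idea suggested by the title: start from the labels assigned by a cross-language classifier, then let each target-language document be relabeled according to its nearest neighbors within the target-language collection. Everything specific here is an assumption made for illustration and not taken from the paper: the function names `cosine` and `reclassify`, the use of cosine similarity over plain feature vectors, the majority vote in which a document's own current label also counts, and the parameters `k` and `max_iters`.

```python
from collections import Counter
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def reclassify(vectors, labels, k=2, max_iters=5):
    """Hypothetical neighbor-based re-classification (not the paper's exact method).

    vectors: feature vectors of the target-language documents.
    labels:  initial labels from the cross-language classifier.
    Each round, every document takes the majority label among its k most
    similar target-language documents plus its own current label.
    """
    labels = list(labels)
    for _ in range(max_iters):
        new_labels = []
        for i, vec in enumerate(vectors):
            # Rank all other target documents by similarity to document i.
            sims = sorted(
                ((cosine(vec, other), j)
                 for j, other in enumerate(vectors) if j != i),
                reverse=True,
            )
            votes = Counter(labels[j] for _, j in sims[:k])
            votes[labels[i]] += 1  # assumption: the document's own label votes too
            new_labels.append(votes.most_common(1)[0][0])
        if new_labels == labels:  # stop once no label changes
            break
        labels = new_labels
    return labels


# Toy data: two tight clusters; the third document starts with a wrong label.
docs = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
        [0.0, 1.0], [0.1, 0.9], [0.05, 0.95]]
initial = ["sports", "sports", "politics",
           "politics", "politics", "politics"]
revised = reclassify(docs, initial, k=2)
```

In this toy run the third document sits among documents labeled "sports", so the neighbor vote overrides its initial "politics" label. Labels are updated synchronously within a round, which keeps the result independent of document order.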

Citation (APA)

Escobar-Acevedo, A., Montes-Y-Gómez, M., & Villaseñor-Pineda, L. (2009). Using nearest neighbor information to improve cross-language text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5845 LNAI, pp. 157–164). https://doi.org/10.1007/978-3-642-05258-3_14
