T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification


Abstract

Cross-lingual text classification leverages text classifiers trained in a high-resource language to perform text classification in other languages with no or minimal fine-tuning (zero-/few-shot cross-lingual transfer). Nowadays, cross-lingual text classifiers are typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest. However, the performance of these models varies significantly across languages and classification tasks, suggesting that the superposition of the language modelling and classification tasks is not always effective. For this reason, in this paper we propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages. The proposed approach couples 1) a neural machine translator translating from the targeted language to a high-resource language with 2) a text classifier trained in the high-resource language, but the neural machine translator generates "soft" translations to permit end-to-end backpropagation during fine-tuning of the pipeline. Extensive experiments have been carried out over three cross-lingual text classification datasets (XNLI, MLDoc, and MultiEURLEX), with the results showing that the proposed approach has significantly improved performance over a competitive baseline.
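The key mechanism the abstract describes — replacing discrete translations with "soft" translations so that gradients can flow from the classifier back into the translator — can be sketched as an expected embedding: instead of picking one target token per position, the decoder's probability distribution over the vocabulary is used to mix the classifier's token embeddings. The sketch below illustrates only this forward computation with toy dimensions; all names, shapes, and values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_translate(decoder_logits, embedding_matrix):
    """Turn decoder logits into 'soft' token embeddings.

    decoder_logits: (seq_len, vocab) scores from the NMT decoder
    embedding_matrix: (vocab, emb_dim) input embeddings of the classifier
    Returns (seq_len, emb_dim): per-position expected embeddings, a
    differentiable stand-in for embeddings of discretely decoded tokens.
    """
    probs = softmax(decoder_logits)          # distribution over vocab per position
    return probs @ embedding_matrix          # expectation under that distribution

# Toy example (illustrative dimensions only).
vocab, emb_dim, seq_len = 5, 4, 3
rng = np.random.default_rng(0)
logits = rng.normal(size=(seq_len, vocab))   # stand-in for decoder output
emb = rng.normal(size=(vocab, emb_dim))      # stand-in for classifier embeddings
soft_inputs = soft_translate(logits, emb)    # fed to the downstream classifier
```

Because every step is a smooth function of the logits, an autodiff framework can backpropagate the classification loss through `soft_translate` into the translator's parameters, which is what enables end-to-end fine-tuning of the two-stage pipeline.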

Citation (APA)

Unanue, I. J., Haffari, G., & Piccardi, M. (2023). T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification. Transactions of the Association for Computational Linguistics, 11, 1147–1161. https://doi.org/10.1162/tacl_a_00593
