Improving Cross-lingual Text Classification with Zero-shot Instance-Weighting


Abstract

Cross-lingual text classification (CLTC) is a challenging task made harder by the lack of labeled data in low-resource languages. In this paper, we propose zero-shot instance-weighting, a general, model-agnostic zero-shot learning framework that improves CLTC by weighting source instances. It adds a module on top of pre-trained language models that computes instance weights from similarity, thus aligning each source instance to the target language. During training, the framework updates parameters by gradient descent weighted by these instance weights. We evaluate the framework on seven target languages across three fundamental tasks and show its effectiveness and extensibility: it improves F1 score by up to 4% in single-source transfer and 8% in multi-source transfer. To the best of our knowledge, our method is the first to apply instance weighting in zero-shot CLTC. It is simple yet effective and easily extensible to multi-source transfer.
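
The abstract describes two steps: computing a similarity-based weight for each source instance, and weighting the training loss (and hence the gradient) by those weights. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the use of cosine similarity against a mean target embedding, and the normalization of weights to sum to 1 are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def instance_weights(source_emb, target_emb):
    """Hypothetical weighting: cosine similarity of each source instance
    to the mean target-language embedding, rescaled to sum to 1."""
    centroid = target_emb.mean(axis=0)
    sims = source_emb @ centroid / (
        np.linalg.norm(source_emb, axis=1) * np.linalg.norm(centroid) + 1e-12
    )
    # Shift cosine similarity from [-1, 1] to [0, 1] so weights are non-negative.
    scores = (sims + 1.0) / 2.0
    return scores / (scores.sum() + 1e-12)

def weighted_cross_entropy(logits, labels, weights):
    """Cross-entropy per instance, combined as a weighted sum.
    Differentiating this loss yields gradients scaled by the instance weights."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_instance = -log_probs[np.arange(len(labels)), labels]
    return float((weights * per_instance).sum())

# Toy data standing in for encoder outputs (assumed shapes: batch x hidden).
rng = np.random.default_rng(0)
source_emb = rng.normal(size=(4, 8))    # labeled source-language batch
target_emb = rng.normal(size=(16, 8))   # unlabeled target-language batch
logits = rng.normal(size=(4, 3))        # classifier outputs for the source batch
labels = np.array([0, 2, 1, 1])

weights = instance_weights(source_emb, target_emb)
loss = weighted_cross_entropy(logits, labels, weights)
```

In this sketch, source instances whose representations lie closer to the target-language batch contribute more to the loss. With uniform weights the loss reduces to the ordinary mean cross-entropy.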

Cite

Li, I., Sen, P., Zhu, H., Li, Y., & Radev, D. (2021). Improving Cross-lingual Text Classification with Zero-shot Instance-Weighting. In RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop (pp. 1–7). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.repl4nlp-1.1
