Unifying Cross-Lingual Transfer across Scenarios of Resource Scarcity

9Citations
Citations of this article
17Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The scarcity of data in many of the world's languages necessitates the transfer of knowledge from other, resource-rich languages. However, the level of scarcity varies significantly across multiple dimensions, including: i) the amount of task-specific data available in the source and target languages; ii) the amount of monolingual and parallel data available for both languages; and iii) the extent to which they are supported by pretrained multilingual and translation models. Prior work has largely treated these dimensions and the various techniques for dealing with them separately; in this paper, we offer a more integrated view by exploring how to deploy the arsenal of cross-lingual transfer tools across a range of scenarios, especially the most challenging, low-resource ones. To this end, we run experiments on the AmericasNLI and NusaX benchmarks over 20 languages, simulating a range of few-shot settings. The best configuration in our experiments employed parameter-efficient language and task adaptation of massively multilingual Transformers, trained simultaneously on source language data and both machine-translated and natural data for multiple target languages. In addition, we show that pre-trained translation models can be easily adapted to unseen languages, thus extending the range of our hybrid technique and translation-based transfer more broadly. Beyond new insights into the mechanisms of cross-lingual transfer, we hope our work will provide practitioners with a toolbox to integrate multiple techniques for different real-world scenarios. Our code is available at https://github.com/parovicm/unified-xlt.

Cite

CITATION STYLE

APA

Ansell, A., Parović, M., Vulić, I., Korhonen, A., & Ponti, E. M. (2023). Unifying Cross-Lingual Transfer across Scenarios of Resource Scarcity. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 3980–3995). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.242

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free