Scheduled Multi-task Learning for Neural Chat Translation


Abstract

Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, their performance is still far from satisfactory due to insufficient chat translation data and overly simple joint training schemes. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks across the training stages to effectively enhance the main chat translation task. Extensive experiments on four language directions (English↔Chinese and English↔German) verify the effectiveness and superiority of the proposed approach. Additionally, we will make the large-scale in-domain paired bilingual dialogue dataset publicly available for the research community.
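The three-stage schedule may be easier to see in code. Below is a minimal sketch, assuming a PyTorch-style training loop; the model, the loss functions (nct_loss, coherence_loss), the stage lengths, and the task weights are all hypothetical placeholders rather than the authors' implementation, and which auxiliary tasks run in which stage is precisely the scheduling question the paper investigates.

import torch
import torch.nn as nn

# Placeholder standing in for the Transformer-based NCT model.
model = nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def nct_loss(model, batch):
    # Main chat-translation loss (placeholder: MSE on random tensors).
    src, tgt = batch
    return nn.functional.mse_loss(model(src), tgt)

def coherence_loss(model, batch):
    # Hypothetical dialogue-coherence auxiliary loss (placeholder).
    src, _ = batch
    return model(src).pow(2).mean()

def make_loader(num_batches):
    # Dummy loader yielding random (source, target) pairs.
    for _ in range(num_batches):
        yield torch.randn(8, 16), torch.randn(8, 16)

def train_stage(loader, aux_tasks):
    """One training stage: the main NCT loss plus whatever auxiliary
    dialogue-related losses are scheduled for this stage, combined as
    a weighted sum."""
    for batch in loader:
        loss = nct_loss(model, batch)
        for aux, weight in aux_tasks:
            loss = loss + weight * aux(model, batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Stage 1: conventional pre-training on general-domain parallel data.
train_stage(make_loader(100), aux_tasks=[])
# Stage 2: the added second pre-training stage on large-scale
# in-domain chat data, with scheduled auxiliary tasks.
train_stage(make_loader(50), aux_tasks=[(coherence_loss, 0.5)])
# Stage 3: fine-tuning on small-scale chat translation data.
train_stage(make_loader(10), aux_tasks=[(coherence_loss, 0.5)])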

Citation (APA)

Liang, Y., Meng, F., Xu, J., Chen, Y., & Zhou, J. (2022). Scheduled Multi-task Learning for Neural Chat Translation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 4375–4388). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.300
