Task-Based MoE for Multitask Multilingual Machine Translation

Abstract

The mixture-of-experts (MoE) architecture has proven to be a powerful method for training deep models on diverse tasks across many applications. However, current MoE implementations are task-agnostic, treating tokens from all tasks in the same manner. In this work, we instead design a novel method that incorporates task information into MoE models at different levels of granularity with shared dynamic task-based adapters. Our experiments and analysis show the advantages of our approaches over dense and canonical MoE models on multitask multilingual machine translation. With task-specific adapters, our models can additionally generalize to new tasks efficiently.
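To illustrate the general idea of task-aware routing described in the abstract, the sketch below shows a minimal task-conditioned MoE layer in PyTorch in which the router sees both the token representation and a learned task embedding. This is not the authors' implementation; all names (TaskConditionedMoE, num_experts, task_id, the top-k routing choice, and the expert feed-forward shape) are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the paper's code) of a task-conditioned MoE layer:
# routing is computed from [token representation ; task embedding], so tokens from
# different tasks (e.g., language pairs) can be routed to different experts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskConditionedMoE(nn.Module):
    def __init__(self, d_model: int, num_experts: int, num_tasks: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.task_emb = nn.Embedding(num_tasks, d_model)        # one embedding per task
        self.router = nn.Linear(2 * d_model, num_experts)       # routes on [token; task]
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); task_id: (batch,) integer task indices
        t = self.task_emb(task_id).unsqueeze(1).expand_as(x)    # broadcast task embedding over tokens
        logits = self.router(torch.cat([x, t], dim=-1))         # (batch, seq, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)          # top-k experts per token
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(x)
        for k in range(self.top_k):                             # simple loop for clarity, not efficiency
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e)                       # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

As a usage sketch, `TaskConditionedMoE(d_model=512, num_experts=8, num_tasks=10)(x, task_id)` would mix expert outputs per token while letting the task identity bias the routing; the paper's shared dynamic task-based adapters operate at additional levels of granularity beyond this token-level illustration.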

Cite

APA

Pham, H., Kim, Y. J., Mukherjee, S., Woodruff, D. P., Póczos, B., & Awadalla, H. H. (2023). Task-based MoE for multitask multilingual machine translation. In Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL 2023) (pp. 268–281). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.mrl-1.13
