Dynamically composing domain-data selection with clean-data selection by “co-curricular learning” for neural machine translation

Abstract

Noise and domain are important aspects of data quality for neural machine translation. Existing research focuses separately on domain-data selection, clean-data selection, or their static combination, leaving the dynamic interaction between them not explicitly examined. This paper introduces a “co-curricular learning” method to compose dynamic domain-data selection with dynamic clean-data selection, for transfer learning across both capabilities. We apply an EM-style optimization procedure to further refine the “co-curriculum”. Experimental results and analysis with two domains demonstrate the effectiveness of the method and the properties of data scheduled by the co-curriculum.

Cite

Wang, W., Caswell, I., & Chelba, C. (2020). Dynamically composing domain-data selection with clean-data selection by “co-curricular learning” for neural machine translation. In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1282–1292). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p19-1123
