Terminological consistency is an essential requirement for industrial translation. High-quality, hand-crafted terminologies contain entries in their nominal forms, so integrating such a terminology into machine translation is not a trivial task: the MT system must be able to disambiguate homographs on the source side and choose the correct wordform on the target side. In this work, we propose a simple but effective method for homograph disambiguation and a method of wordform selection based on multi-choice lexical constraints. We also propose a metric to measure the terminological consistency of the translation. Our results show a significant improvement over the current state of the art (SOTA) in terminological consistency, without any loss in BLEU score. All the code used in this work will be published as open source.
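The abstract does not define the proposed consistency metric, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a lexicon that maps each source term to a list of acceptable target wordforms (the "multi-choice" idea) and scores a translation by the fraction of source terms for which any listed wordform appears in the output. The function name `term_consistency`, the lexicon layout, and the whitespace tokenization are all hypothetical simplifications.

```python
from typing import Dict, List


def term_consistency(source: str, hypothesis: str,
                     lexicon: Dict[str, List[str]]) -> float:
    """Share of lexicon terms present in the source whose translation
    appears in the hypothesis in any of the allowed wordforms.

    Assumption: single-token terms and naive whitespace tokenization.
    A real system would need proper tokenization and multi-word matching.
    """
    src_tokens = set(source.lower().split())
    hyp_tokens = set(hypothesis.lower().split())

    total = 0    # lexicon terms that occur in the source
    matched = 0  # of those, terms rendered with an allowed wordform
    for term, wordforms in lexicon.items():
        if term.lower() in src_tokens:
            total += 1
            if any(w.lower() in hyp_tokens for w in wordforms):
                matched += 1

    # With no terms to check, treat the sentence as trivially consistent.
    return matched / total if total else 1.0


# Hypothetical English-German lexicon with several wordforms per entry.
lexicon = {
    "screw": ["Schraube", "Schrauben"],
    "nut": ["Mutter", "Muttern"],
}
score = term_consistency(
    "tighten the screw and the nut",
    "ziehen Sie die Schrauben und den Bolzen an",
    lexicon,
)
print(score)  # 0.5: "screw" is matched by "Schrauben", "nut" is not
```

Listing several wordforms per entry lets an inflected form such as "Schrauben" count as a correct rendering of the entry "Schraube", which is the kind of target-side wordform choice the abstract describes.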
CITATION STYLE
Öz, O., & Sukhareva, M. (2021). Towards Precise Lexicon Integration in Neural Machine Translation. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 1084–1095). Incoma Ltd. https://doi.org/10.26615/978-954-452-072-4_122