Extraction of bilingual technical terms for chinese-japanese patent translation

4Citations
Citations of this article
70Readers
Mendeley users who have this article in their library.

Abstract

The translation of patents or scientific papers is a key issue that should be helped by the use of statistical machine translation (SMT). In this paper, we propose a method to improve Chinese-Japanese patent SMT by premarking the training corpus with aligned bilingual multi-word terms. We automatically extract multi-word terms from monolingual corpora by combining statistical and linguistic filtering methods. We use the sampling-based alignment method to identify aligned terms and set some threshold on translation probabilities to select the most promising bilingual multi-word terms. We pre-mark a Chinese- Japanese training corpus with such selected aligned bilingual multi-word terms. We obtain the performance of over 70% precision in bilingual term extraction and a significant improvement of BLEU scores in our experiments on a Chinese-Japanese patent parallel corpus.

Cite

CITATION STYLE

APA

Yang, W., Yan, J., & Lepage, Y. (2016). Extraction of bilingual technical terms for chinese-japanese patent translation. In HLT-NAACL 2016 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Student Research Workshop (pp. 81–87). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-2012

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free