Medical data machine translation evaluation based on dependency n-grams

Abstract

Machine translation is increasingly applied to cross-lingual processing of medical data. To evaluate translation quality, automatic metrics such as BLEU and NIST, most of which are n-gram based, are widely used alongside costly human evaluation. Current evaluation approaches make only surface-level linguistic comparisons between candidate and reference translations. Moreover, domain features such as medical terms and the dependency and cohesion relations within and among sentences in medical documents should be taken into account when evaluating translations. However, severe noise is introduced when faulty machine translations are parsed with current syntactic parsers. Comparing these noisy parses with the references therefore limits any improvement in evaluation, even though deep processing is incorporated. To reduce noise while still capturing the main meaning of a sentence, the paper proposes extracting dependency n-grams from dependency parses of the reference translations only. The dependency n-grams are stemmed and extended according to linguistic rules, and then treated as the key points for quality evaluation. The score of a candidate translation is computed from the count of loosely matched dependency n-grams. A penalty for short translations and a clipped count based on the highest frequency of dependency n-grams are also incorporated into the final score of the candidate. Experiments on our translation datasets show that evaluation based on dependency n-grams significantly outperforms the BLEU and NIST metrics. The approach is also significantly better than related research that employs dependency relation parsing in evaluation.
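
The abstract does not give the scoring formula, so the following Python sketch is only one plausible reading of it, not the authors' implementation. It assumes that the dependency n-grams have already been extracted from a parse of the reference (here they are written by hand), that "loose matching" means every stemmed word of an n-gram occurs somewhere in the candidate, that clipping caps each n-gram's matches at its reference frequency, and that the short-translation penalty follows BLEU's brevity penalty. The names `crude_stem`, `brevity_penalty`, and `dependency_ngram_score` are hypothetical, and the toy stemmer stands in for a real one; the rule-based extension of n-grams is omitted.

```python
import math
from collections import Counter


def crude_stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)]
    return word


def brevity_penalty(cand_len, ref_len):
    """BLEU-style penalty for candidates shorter than the reference (assumption)."""
    if cand_len == 0:
        return 0.0
    if cand_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / cand_len)


def dependency_ngram_score(ref_ngrams, ref_tokens, cand_tokens):
    """Score a candidate against dependency n-grams taken from the reference only.

    ref_ngrams: word tuples assumed to come from a dependency parse of the
    reference; the candidate itself is never parsed.
    """
    # Stem the reference n-grams and count how often each one occurs.
    ref_counts = Counter(
        tuple(crude_stem(w) for w in ngram) for ngram in ref_ngrams if ngram
    )
    cand_counts = Counter(crude_stem(w) for w in cand_tokens)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0

    matched = 0
    for ngram, ref_count in ref_counts.items():
        # Loose match: word order and adjacency in the candidate are ignored.
        found = min(cand_counts[stem] for stem in ngram)
        # Clip so a repeated word cannot be credited more often than in the reference.
        matched += min(found, ref_count)

    return brevity_penalty(len(cand_tokens), len(ref_tokens)) * matched / total


# Hand-written dependency n-grams for an illustrative reference sentence.
reference = "the patient was treated with antibiotics".split()
ngrams = [
    ("patient", "treated"),
    ("treated", "antibiotics"),
    ("treated", "with", "antibiotics"),
]
candidate = "patients treated with antibiotic".split()
print(dependency_ngram_score(ngrams, reference, candidate))
```

In this example all three n-grams match after stemming, so the score is reduced only by the length penalty for the four-word candidate against the six-word reference.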

Cite

Qin, Y., & Liang, Y. (2017). Medical data machine translation evaluation based on dependency n-grams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10219 LNCS, pp. 174–181). Springer Verlag. https://doi.org/10.1007/978-3-319-59858-1_17
