Otem&Utem: Over- and Under-Translation Evaluation Metric for NMT


Abstract

Although neural machine translation (NMT) yields promising translation performance, it suffers from over- and under-translation issues [31], which have become a research hotspot in NMT. At present, studies of these issues mainly rely on dominant automatic evaluation metrics, such as BLEU, to assess overall translation quality with respect to both adequacy and fluency. However, such metrics cannot accurately measure how well NMT systems handle the above-mentioned issues. In this paper, we propose two quantitative metrics, Otem and Utem, to automatically evaluate system performance in terms of over- and under-translation, respectively. Both metrics are based on the proportion of mismatched n-grams between the gold reference and the system translation. We validate both metrics by comparing their scores with human evaluations, where the resulting Pearson correlation coefficients reveal a strong correlation. Moreover, in-depth analyses of various translation systems expose inconsistencies between BLEU and our proposed metrics, highlighting the necessity and significance of our metrics.
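The core idea of counting mismatched n-grams can be sketched as follows. This is a simplified illustration, not the paper's exact formulation (the published Otem/Utem definitions include additional refinements such as brevity-style penalties and score combination across n-gram orders); here, over-translation is approximated by the proportion of hypothesis n-grams exceeding their reference counts, and under-translation by the proportion of reference n-grams absent from the hypothesis:

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def otem_utem(hypothesis, reference, max_n=2):
    """Simplified sketch of over-/under-translation scores based on
    mismatched n-gram proportions (an assumption for illustration;
    the paper's exact formulas differ)."""
    over, under = [], []
    for n in range(1, max_n + 1):
        hyp = ngram_counts(hypothesis, n)
        ref = ngram_counts(reference, n)
        # Over-translation: hypothesis n-grams occurring more often than in the reference.
        extra = sum((hyp - ref).values())
        # Under-translation: reference n-grams missing (or undercounted) in the hypothesis.
        missing = sum((ref - hyp).values())
        over.append(extra / max(sum(hyp.values()), 1))
        under.append(missing / max(sum(ref.values()), 1))
    # Average the per-order proportions (uniform weights, another simplification).
    return sum(over) / max_n, sum(under) / max_n

# Example: a repeated phrase inflates the over-translation score only.
o, u = otem_utem("the cat the cat sat".split(), "the cat sat".split())
```

Higher scores indicate worse behavior: a hypothesis that repeats source content raises the first value, while one that drops content raises the second, whereas BLEU would only penalize both indirectly through overall n-gram precision.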

Citation (APA)

Yang, J., Zhang, B., Qin, Y., Zhang, X., Lin, Q., & Su, J. (2018). Otem&Utem: Over- and Under-Translation Evaluation Metric for NMT. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11108 LNAI, pp. 291–302). Springer Verlag. https://doi.org/10.1007/978-3-319-99495-6_25
