MEANT 2.0: Accurate semantic MT evaluation for any output language

Abstract

We describe a new version of MEANT, which participated in the metrics task of the Second Conference on Machine Translation (WMT 2017). MEANT 2.0 uses idf-weighted distributional n-gram accuracy to determine the phrasal similarity of semantic role fillers and yields better correlations with human judgments of translation quality than earlier versions. The improved phrasal similarity enables a variant of MEANT to accurately evaluate translation adequacy for any output language, even languages without an automatic semantic parser. Our results show that MEANT, an untrained, non-ensemble metric, consistently performs as well as the top participants of previous years, including trained and ensemble metrics, across different output languages. We also report timing statistics for MEANT to allow better estimation of its evaluation cost. MEANT 2.0 is open source and publicly available.
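To make the phrasal-similarity idea concrete, below is a minimal sketch (not the authors' implementation) of idf-weighted n-gram matching between a candidate role filler and a reference role filler. The exact-match token similarity, the n-gram order cap, and the aggregation over n-gram orders are simplifying assumptions for illustration; MEANT 2.0 scores tokens by the similarity of their distributional word vectors rather than by exact matching.

```python
import math


def idf(token, reference_docs):
    """Smoothed inverse document frequency of a token over a reference collection."""
    df = sum(1 for doc in reference_docs if token in doc)
    return math.log((len(reference_docs) + 1) / (df + 1)) + 1.0


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def token_sim(a, b):
    """Stand-in lexical similarity: exact string match.
    MEANT 2.0 instead uses distributional word-vector similarity,
    so near-synonyms also receive partial credit."""
    return 1.0 if a == b else 0.0


def phrase_similarity(candidate, reference, reference_docs, max_n=2):
    """Toy idf-weighted n-gram similarity between two role-filler phrases."""
    per_order_scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(candidate, n)
        ref_ngrams = ngrams(reference, n)
        if not cand_ngrams or not ref_ngrams:
            continue
        matched, total_weight = 0.0, 0.0
        for cg in cand_ngrams:
            # Weight each candidate n-gram by the idf of its tokens, so
            # content words count for more than function words.
            weight = sum(idf(t, reference_docs) for t in cg)
            # Score the n-gram against its best-matching reference n-gram.
            best = max(
                sum(token_sim(a, b) for a, b in zip(cg, rg)) / n
                for rg in ref_ngrams
            )
            matched += weight * best
            total_weight += weight
        per_order_scores.append(matched / total_weight if total_weight else 0.0)
    return (sum(per_order_scores) / len(per_order_scores)
            if per_order_scores else 0.0)


if __name__ == "__main__":
    docs = [["the", "cat", "sat", "down"],
            ["a", "dog", "barked"],
            ["the", "dog", "sat", "down"]]
    # Compare a candidate role filler against its reference counterpart.
    print(phrase_similarity(["the", "cat", "sat"],
                            ["the", "cat", "sat", "down"], docs))
```

In the full metric, such phrase-level scores for the aligned semantic role fillers are further aggregated into a sentence-level adequacy score; the sketch above only illustrates the idf-weighted n-gram component named in the abstract.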

Cite (APA)

Lo, C.-K. (2017). MEANT 2.0: Accurate semantic MT evaluation for any output language. In Proceedings of the Second Conference on Machine Translation (WMT 2017) (pp. 589–597). Association for Computational Linguistics. https://doi.org/10.18653/v1/w17-4767
