Abstract
Existing metrics for evaluating the quality of Machine Translation hypotheses take different perspectives into account. DPMFcomb, a metric combining the merits of a range of metrics, achieved the best performance for evaluation of to-English language pairs in the previous two years of WMT Metrics Shared Tasks. This year, we submit a novel combined metric, Blend, to the WMT17 Metrics task. Compared to DPMFcomb, Blend includes the following adaptations: i) We use Direct Assessment (DA) human evaluation to guide the training process with a vast reduction in required training data, while still achieving improved performance when evaluated on WMT16 to-English language pairs; ii) We carry out experiments to explore the contribution of the metrics incorporated in Blend, in order to find a trade-off between performance and efficiency.
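The core idea described in the abstract — learning a combination of component metric scores so that the combined score best predicts human Direct Assessment (DA) judgments — can be sketched as follows. This is a minimal illustration, not the paper's actual method: the component metrics, toy data, and the plain least-squares regressor stand in for whatever learner and feature set the submission uses.

```python
import numpy as np

def train_combined_metric(metric_scores, da_scores):
    """Fit weights (plus a bias term) mapping per-segment component-metric
    scores to human DA scores, via ordinary least squares (a stand-in
    regressor for illustration only)."""
    X = np.hstack([metric_scores, np.ones((metric_scores.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(X, da_scores, rcond=None)
    return weights

def combined_score(weights, metric_scores):
    """Score new segments with the learned linear combination."""
    X = np.hstack([metric_scores, np.ones((metric_scores.shape[0], 1))])
    return X @ weights

# Toy data: 4 segments scored by 2 hypothetical component metrics;
# the "human" DA scores here are synthetic (0.5*m1 + 0.3*m2 + 0.1).
metrics = np.array([[0.2, 0.4], [0.6, 0.1], [0.8, 0.9], [0.5, 0.5]])
da = metrics @ np.array([0.5, 0.3]) + 0.1

w = train_combined_metric(metrics, da)
preds = combined_score(w, metrics)
```

Because the training signal is a single scalar per segment, far fewer human judgments are needed than for pairwise ranking schemes, which is the data-reduction benefit the abstract refers to.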
Citation
Ma, Q., Graham, Y., Wang, S., & Liu, Q. (2017). Blend: A novel combined MT metric based on direct assessment - CASICT-DCU submission to WMT17 metrics task. In WMT 2017 - 2nd Conference on Machine Translation, Proceedings (pp. 598–603). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4768