The quality of machine translation (MT) is best judged by humans well versed in both the source and target languages. However, automatic techniques are often used instead, as they are faster, cheaper, and language-independent. The goal of this paper is to assess the correlation between manual and automatic evaluation, specifically in the context of Indian languages. To the extent that automatic evaluation methods correlate with manual evaluations, we can get the best of both worlds. In this paper, we perform a comparative study of the automatic evaluation metrics BLEU, NIST, METEOR, TER, and WER against the manual evaluation metric (adequacy) for English-Hindi translation. We also attempt to estimate the manual evaluation score of a given MT output from its automatic evaluation score. The data for the study was sourced from the Workshop on Statistical Machine Translation (WMT14).
Maurya, K. K., Ravindran, R. P., Anirudh, C. R., & Murthy, K. N. (2020). Machine Translation Evaluation: Manual Versus Automatic—A Comparative Study. In Advances in Intelligent Systems and Computing (Vol. 1079, pp. 541–553). Springer. https://doi.org/10.1007/978-981-15-1097-7_45