Finding the optimal vocabulary size for neural machine translation

58 citations · 101 Mendeley readers

Abstract

We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both the classification and autoregression components. Classifiers are known to perform better when class distributions are balanced during training. Since the Zipfian nature of language causes imbalanced classes, we explore its effect on NMT. We analyze the effect of various vocabulary sizes on NMT performance across multiple languages and a range of training data sizes, and offer an explanation for why certain vocabulary sizes are better than others.
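The link between Zipfian frequencies and class imbalance can be made concrete with a small illustration. The sketch below is not from the paper; the Zipf exponent (s = 1) and the use of total-variation distance from the uniform distribution as the imbalance measure are assumptions chosen for illustration. It shows that under a Zipfian prior, enlarging the vocabulary mostly adds rare classes, pushing the class distribution further from uniform.

```python
import numpy as np

def zipf_distribution(k: int, s: float = 1.0) -> np.ndarray:
    """Normalized Zipfian class probabilities over k classes with exponent s."""
    ranks = np.arange(1, k + 1)
    weights = ranks ** -s
    return weights / weights.sum()

def imbalance(p: np.ndarray) -> float:
    """Total-variation distance from the uniform distribution over len(p)
    classes: 0 means perfectly balanced; values near 1 mean extreme skew.
    (An illustrative measure, not necessarily the one used in the paper.)"""
    k = len(p)
    return 0.5 * np.abs(p - 1.0 / k).sum()

# Larger vocabularies add ever-rarer types, so the class distribution
# drifts further from uniform.
for vocab_size in (1_000, 8_000, 32_000, 64_000):
    p = zipf_distribution(vocab_size)
    print(f"|V| = {vocab_size:>6,}: imbalance = {imbalance(p):.3f}")
```

The trend this prints, with imbalance growing monotonically in vocabulary size, illustrates the trade-off the paper studies: a larger subword vocabulary shortens sequences but makes the classification problem more imbalanced, since rare types are seen too seldom during training.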

Citation (APA)

Gowda, T., & May, J. (2020). Finding the optimal vocabulary size for neural machine translation. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3955–3964). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.352
