Abstract
We present a simple algorithm to efficiently train language models with noise-contrastive estimation (NCE) on graphics processing units (GPUs). Our NCE-trained language models achieve significantly lower perplexity on the One Billion Word Benchmark language modeling challenge, and contain one-sixth the parameters of the best single model in Chelba et al. (2013). When incorporated into a strong Arabic-English machine translation system, they provide a significant boost in translation quality. We release a toolkit so that others may also train large-scale, large-vocabulary LSTM language models with NCE, parallelizing computation across multiple GPUs.
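To make the technique concrete, the sketch below shows the core NCE objective for a language model: each gold word is contrasted against k words drawn from a noise distribution, so the model never computes the full softmax over the vocabulary. This is a minimal, hypothetical PyTorch illustration, not the released toolkit's API; the names (hidden, out_embed, out_bias, noise_dist, k) are assumptions for the example.

```python
# Hypothetical sketch of the NCE loss for an RNN language model (PyTorch).
# Not the paper's toolkit; names and shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def nce_loss(hidden, targets, out_embed, out_bias, noise_dist, k=100):
    """hidden: (batch, d) RNN states; targets: (batch,) gold word ids;
    out_embed: (V, d) output embeddings; out_bias: (V,);
    noise_dist: (V,) unigram noise distribution; k: noise samples per target."""
    batch = hidden.size(0)
    # Draw k noise words per example from the noise distribution.
    noise = torch.multinomial(noise_dist, batch * k, replacement=True).view(batch, k)
    # Unnormalized score s(w, h) = h . e_w + b_w, treated as a log-probability;
    # NCE lets the model learn to self-normalize, avoiding the full softmax.
    s_target = (hidden * out_embed[targets]).sum(-1) + out_bias[targets]        # (batch,)
    s_noise = torch.bmm(out_embed[noise], hidden.unsqueeze(2)).squeeze(2) \
              + out_bias[noise]                                                 # (batch, k)
    # P(D=1 | w, h) = p_model / (p_model + k q(w)) = sigmoid(s - log(k q(w)))
    log_kq_target = torch.log(k * noise_dist[targets])
    log_kq_noise = torch.log(k * noise_dist[noise])
    loss = -F.logsigmoid(s_target - log_kq_target) \
           - F.logsigmoid(-(s_noise - log_kq_noise)).sum(-1)
    return loss.mean()
```

Because the per-example cost depends on k rather than the vocabulary size V, the output-layer computation stays small even for very large vocabularies, which is what makes GPU training of such models practical.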
Citation
Zoph, B., Vaswani, A., May, J., & Knight, K. (2016). Simple, fast noise-contrastive estimation for large RNN vocabularies. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference (pp. 1217–1222). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-1145