The neural network joint model (NNJM), which augments the neural network language model (NNLM) with an m-word source context window, has achieved large gains in machine translation accuracy, but suffers from high normalization cost when using large vocabularies. Training the NNJM with noise-contrastive estimation (NCE), instead of standard maximum likelihood estimation (MLE), can reduce computation cost. In this paper, we propose an alternative to NCE, the binarized NNJM (BNNJM), which learns a binary classifier that takes both the context and target words as input, and can be efficiently trained using MLE. We compare the BNNJM and the NNJM trained by NCE on Chinese-to-English and Japanese-to-English translation tasks.
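To make the binarized idea concrete, the following is a minimal illustrative sketch (not the paper's actual network or features): a toy logistic classifier that scores a (context, target) pair as genuine or noise, trained by MLE on binary labels, so no normalization over the full vocabulary is needed. The vocabulary, embeddings, and feature construction here are invented for illustration.

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "mat", "dog"]
DIM = 8  # toy embedding size (assumption, not from the paper)

# Fixed random word embeddings, shared by context and target positions.
emb = {w: [random.gauss(0.0, 0.1) for _ in range(DIM)] for w in VOCAB}
weights = [0.0] * (2 * DIM)  # classifier weights over [context ; target]
bias = 0.0

def features(context, target):
    # Mean of context embeddings concatenated with the target embedding.
    ctx = [sum(emb[w][i] for w in context) / len(context) for i in range(DIM)]
    return ctx + emb[target]

def prob_true(context, target):
    # P(label = 1 | context, target): probability the pair is genuine.
    z = sum(w * x for w, x in zip(weights, features(context, target))) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train_step(context, target, label, lr=0.5):
    # One MLE (binary cross-entropy) gradient step on a labeled pair.
    global bias
    p = prob_true(context, target)
    g = p - label  # gradient of the log-loss w.r.t. the logit
    x = features(context, target)
    for i in range(len(weights)):
        weights[i] -= lr * g * x[i]
    bias -= lr * g

# Each true (context, target) pair is a positive example; randomly
# sampled vocabulary words stand in as negative (noise) targets.
data = [(["the", "cat"], "sat"), (["the", "dog"], "sat")]
for _ in range(200):
    for ctx, tgt in data:
        train_step(ctx, tgt, 1.0)                          # genuine pair
        train_step(ctx, random.choice(["mat", "the"]), 0.0)  # noise pair
```

After training, `prob_true` separates genuine from noise targets, and each evaluation costs a single dot product rather than a softmax over the vocabulary.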