Incrementally learning the hierarchical softmax function for neural language models


Abstract

Neural network language models (NNLMs) have recently attracted a great deal of attention. In this paper, we present a training method that incrementally learns the hierarchical softmax function for NNLMs. We split the cost function to model the old and update corpora separately, and factorize the objective function for the hierarchical softmax. We then provide a new stochastic gradient based method that updates all the word vectors and parameters by comparing the old tree, built from the old corpus, with the new tree, built from the combined (old and update) corpus. Theoretical analysis shows that the mean square error of the parameter vectors can be bounded by a function of the number of changed words related to the corresponding parameter node. Experimental results show that incremental training saves a substantial amount of time: the smaller the update corpus, the faster the update training, with a speedup of up to 30 times. We also use word similarity/relatedness tasks and a dependency parsing task as benchmarks to evaluate the correctness of the updated word vectors.
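
For concreteness, the following sketch uses the standard hierarchical softmax formulation (in the style of Mikolov et al.'s word2vec) and shows one plausible way the objective can be split over the old and update corpora; the notation (path nodes n(w, j), node parameters \theta_n, corpora D_old and D_up) is illustrative and not taken verbatim from the paper.

\[
  p(w \mid h) \;=\; \prod_{j=1}^{L(w)-1} \sigma\!\left( [\![\, n(w, j{+}1) = \mathrm{ch}(n(w, j)) \,]\!] \; \theta_{n(w,j)}^{\top} h \right)
\]

Here L(w) is the length of the root-to-leaf path for word w, n(w, j) is the j-th node on that path, ch(n) is a fixed child of node n, \sigma is the logistic function, and [\![\cdot]\!] evaluates to +1 or -1. Under this factorization, a split log-likelihood of the form

\[
  J(\theta) \;=\; \sum_{(w, h) \in D_{\mathrm{old}}} \log p(w \mid h) \;+\; \sum_{(w, h) \in D_{\mathrm{up}}} \log p(w \mid h)
\]

allows stochastic gradient steps to be taken mainly on the update corpus D_up, while the contribution of D_old is carried by the parameters already trained on the old tree.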

Citation (APA)

Peng, H., Li, J., Song, Y., & Liu, Y. (2017). Incrementally learning the hierarchical softmax function for neural language models. In 31st AAAI Conference on Artificial Intelligence, AAAI 2017 (pp. 3267–3273). AAAI Press. https://doi.org/10.1609/aaai.v31i1.10994
