A semantic relation preserved word embedding reuse method

Abstract

When deep learning is applied to natural language processing, a word embedding layer can significantly improve task performance because word vectors encode semantic information. Word embeddings can be optimized end-to-end together with the rest of the model. However, given the number of parameters in a word embedding layer, models trained on a small corpus easily overfit the training set. To mitigate this, pretrained embeddings learned from a much larger corpus are commonly used to boost the performance of the current model. This paper summarizes several methods for reusing pretrained word embeddings. Moreover, as corpus topics change, new words appear in a given task, and their embeddings cannot be found among the pretrained vectors. We therefore propose a semantic relation preserved word embedding reuse method: it first learns word relations from the current corpus, and then uses the pretrained word embeddings to generate embeddings for the newly observed words. Experimental results verify the effectiveness of the proposed method.
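The abstract does not detail the algorithm, so the following is only a minimal Python sketch of the general idea: word relations are approximated here by windowed co-occurrence counts in the task corpus, and each newly observed word receives the count-weighted average of the pretrained vectors of its in-vocabulary neighbors. The function names (cooccurrence_relations, embed_new_words) and the co-occurrence relation model are illustrative assumptions, not the paper's actual formulation.

    from collections import Counter, defaultdict
    import numpy as np

    def cooccurrence_relations(corpus, window=2):
        # Count symmetric co-occurrences within a small window; these
        # counts stand in for the word relations learned from the corpus.
        rel = defaultdict(Counter)
        for sentence in corpus:
            for i, word in enumerate(sentence):
                lo = max(0, i - window)
                hi = min(len(sentence), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        rel[word][sentence[j]] += 1
        return rel

    def embed_new_words(corpus, pretrained, dim, window=2):
        # For every word absent from the pretrained vocabulary, build a
        # vector as the relation-weighted average of its neighbors'
        # pretrained vectors, so corpus-level relations are preserved.
        relations = cooccurrence_relations(corpus, window)
        embeddings = dict(pretrained)
        for word, neighbors in relations.items():
            if word in embeddings:
                continue
            known = [(n, c) for n, c in neighbors.items() if n in pretrained]
            if not known:
                # No neighbor has a pretrained vector: neutral fallback.
                embeddings[word] = np.zeros(dim)
                continue
            total = sum(c for _, c in known)
            embeddings[word] = sum((c / total) * pretrained[n] for n, c in known)
        return embeddings

    # Example: "bert" is missing from the pretrained vectors, so it gets a
    # vector composed from its observed neighbors "the", "model", "works".
    rng = np.random.default_rng(0)
    pretrained = {w: rng.standard_normal(50) for w in ("the", "model", "works")}
    corpus = [["the", "bert", "model", "works"]]
    vectors = embed_new_words(corpus, pretrained, dim=50)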

Citation (APA)

Li, X., & Zhan, D. (2020). A semantic relation preserved word embedding reuse method. Scientia Sinica Informationis, 50(6), 813–823. https://doi.org/10.1360/SSI-2019-0284
