Learning word representations with regularization from prior knowledge

Citations: 35
Readers (Mendeley): 80

Abstract

Conventional word embeddings are trained with specific criteria (e.g., based on language modeling or co-occurrence) within a single information source, disregarding the opportunity for further calibration using external knowledge. This paper presents a unified framework that leverages pre-learned or external priors, in the form of a regularizer, to enhance conventional language model-based embedding learning. We consider two types of regularizers. The first type is derived from topic distributions obtained by running latent Dirichlet allocation on unlabeled data. The second type is based on dictionaries created with human annotation effort. To learn effectively with the regularizers, we propose a novel data structure, trajectory softmax. The resulting embeddings are evaluated on word similarity and sentiment classification. Experimental results show that our learning framework with regularization from prior knowledge improves embedding quality across multiple datasets, compared to a diverse collection of baseline methods.
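
To make the regularization idea concrete, here is a minimal sketch in Python: a toy skip-gram-with-negative-sampling learner whose updates are interleaved with a simple L2 attraction regularizer that pulls together the vectors of words a prior deems related (e.g., words sharing an LDA topic or a dictionary entry). The vocabulary, pair list, and hyperparameters are purely illustrative, and the paper's trajectory softmax structure is not reproduced here.

```python
import numpy as np

# Toy vocabulary and an illustrative prior: pairs of words assumed related,
# e.g. because they share an LDA topic or a dictionary sense (hypothetical).
vocab = ["good", "great", "bad", "awful", "movie"]
idx = {w: i for i, w in enumerate(vocab)}
related_pairs = [("good", "great"), ("bad", "awful")]

rng = np.random.default_rng(0)
dim = 8
W = rng.normal(scale=0.1, size=(len(vocab), dim))  # word vectors
C = rng.normal(scale=0.1, size=(len(vocab), dim))  # context vectors

lr, lam = 0.1, 0.05  # learning rate and regularizer weight (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context, label):
    """One skip-gram-with-negative-sampling update on a word pair.
    label = 1 for an observed pair, 0 for a negative sample."""
    w, c = idx[center], idx[context]
    g = label - sigmoid(W[w] @ C[c])  # gradient of the log-likelihood
    dw, dc = g * C[c], g * W[w]       # compute both grads before updating
    W[w] += lr * dw
    C[c] += lr * dc

def prior_step():
    """Regularizer step: pull vectors of prior-related words together,
    i.e. descend on lam * ||W_i - W_j||^2 / 2 for each related pair."""
    for a, b in related_pairs:
        i, j = idx[a], idx[b]
        diff = W[i] - W[j]
        W[i] -= lr * lam * diff
        W[j] += lr * lam * diff

for _ in range(200):
    sgns_step("good", "movie", 1)  # observed co-occurrences
    sgns_step("bad", "movie", 1)
    sgns_step("good", "awful", 0)  # negative sample
    prior_step()                   # interleaved prior regularization

def cos(a, b):
    va, vb = W[idx[a]], W[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print("cos(good, great):", round(cos("good", "great"), 3))
print("cos(good, bad):  ", round(cos("good", "bad"), 3))
```

After training, the prior-related pair (good, great) should score a higher cosine similarity than (good, bad), even though the corpus signal alone is weak; this mirrors, in miniature, how an external prior calibrates embeddings beyond the single training source.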


Citation (APA)

Song, Y., Lee, C. J., & Xia, F. (2017). Learning word representations with regularization from prior knowledge. In CoNLL 2017 - 21st Conference on Computational Natural Language Learning, Proceedings (pp. 143–152). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/k17-1016

Readers over time: [chart of Mendeley readers per year, 2017–2025]

Readers' Seniority

PhD / Post grad / Masters / Doc: 23 (68%)
Researcher: 6 (18%)
Lecturer / Post doc: 3 (9%)
Professor / Associate Prof.: 2 (6%)

Readers' Discipline

Computer Science: 26 (72%)
Linguistics: 5 (14%)
Engineering: 3 (8%)
Agricultural and Biological Sciences: 2 (6%)
