Attention Word Embedding

Citations: 6 · Mendeley readers: 88

Abstract

Word embedding models learn semantically rich vector representations of words and are widely used to initialize natural language processing (NLP) models. The popular continuous bag-of-words (CBOW) model of word2vec learns a vector embedding by masking a given word in a sentence and then using the other words as context to predict it. A limitation of CBOW is that it weights the context words equally when making a prediction, which is inefficient, since some words have higher predictive value than others. We tackle this inefficiency by introducing the Attention Word Embedding (AWE) model, which integrates the attention mechanism into the CBOW model. We also propose AWE-S, which incorporates subword information. We demonstrate that AWE and AWE-S outperform the state-of-the-art word embedding models both on a variety of word similarity datasets and when used for initialization of NLP models.
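
The core change the abstract describes is replacing CBOW's uniform averaging of context embeddings with attention weights. The sketch below is a minimal, hypothetical PyTorch illustration of that idea: attention scores come from a dot product between a learned query vector for the masked word and key vectors for the context words, and the attention-weighted sum of context embeddings is then used to predict the masked word. The AttentionCBOW class name, the query/key parameterization, and the full-softmax loss are assumptions for illustration only; the paper's exact scoring function, training objective (e.g., negative sampling), and the subword handling in AWE-S may differ.

    # Minimal illustrative sketch of an attention-weighted CBOW objective.
    # NOTE: the query/key parameterization and the full-softmax loss are
    # assumptions for illustration; they are not taken from the AWE paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionCBOW(nn.Module):
        def __init__(self, vocab_size: int, embed_dim: int):
            super().__init__()
            self.in_embed = nn.Embedding(vocab_size, embed_dim)   # context-word embeddings
            self.out_embed = nn.Embedding(vocab_size, embed_dim)  # output embeddings for prediction
            self.key = nn.Embedding(vocab_size, embed_dim)        # keys for context words (assumed)
            self.query = nn.Embedding(vocab_size, embed_dim)      # queries for the masked word (assumed)

        def forward(self, context_ids: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
            # context_ids: (batch, window); target_ids: (batch,)
            ctx = self.in_embed(context_ids)                 # (batch, window, dim)
            k = self.key(context_ids)                        # (batch, window, dim)
            q = self.query(target_ids).unsqueeze(2)          # (batch, dim, 1)
            scores = torch.bmm(k, q).squeeze(2)              # (batch, window) unnormalized attention
            weights = F.softmax(scores, dim=1)               # attention weights over context words
            # Attention-weighted sum replaces CBOW's uniform average of context vectors.
            hidden = torch.bmm(weights.unsqueeze(1), ctx).squeeze(1)  # (batch, dim)
            logits = hidden @ self.out_embed.weight.t()      # (batch, vocab_size)
            return F.cross_entropy(logits, target_ids)

    # Toy usage: a batch of 2 training examples with a context window of 4 words.
    model = AttentionCBOW(vocab_size=10_000, embed_dim=128)
    context = torch.randint(0, 10_000, (2, 4))
    target = torch.randint(0, 10_000, (2,))
    loss = model(context, target)
    loss.backward()

In plain CBOW the hidden vector would simply be ctx.mean(dim=1); the softmax-weighted sum is the only change in this sketch, which is how attention lets more predictive context words contribute more to the prediction of the masked word.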

Cite (APA)

Sonkar, S., Waters, A. E., & Baraniuk, R. G. (2020). Attention Word Embedding. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 6894–6902). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.608
