Understanding Word Embeddings and Language Models

  • Gomez-Perez, J. M.
  • Denaux, R.
  • Garcia-Silva, A.

Abstract

Early word embedding algorithms such as word2vec and GloVe generate a single static distributional representation per word, regardless of the context or the sense in which the word is used in a given sentence. As a result, they model ambiguous words poorly and provide no coverage for out-of-vocabulary words. A new wave of algorithms based on training language models, such as OpenAI GPT and BERT, has therefore been proposed to generate contextual word embeddings. These models take word constituents (subword pieces) as input, which allows them to build representations for out-of-vocabulary words by combining the pieces. More recently, fine-tuning language models pre-trained on large corpora has consistently advanced the state of the art across many NLP tasks.
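To make the contrast concrete, the sketch below uses the Hugging Face transformers library (an assumption; the chapter does not prescribe a toolkit) to show how BERT's WordPiece tokenizer splits an unseen word into subword pieces, and how the same surface word receives different contextual vectors in different sentences, unlike a static word2vec or GloVe lookup. The model name, example word, and example sentences are illustrative choices, not taken from the chapter.

```python
# Sketch: subword tokenization and contextual embeddings with BERT,
# via the Hugging Face transformers library (assumed installed:
# `pip install transformers torch`). Model choice is illustrative.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# 1) Out-of-vocabulary handling: an unseen word is split into word pieces,
#    so the model can still build a representation by combining them.
print(tokenizer.tokenize("embeddinglessness"))
# -> a list of word pieces such as ['em', '##bed', ...]; the exact split
#    depends on the WordPiece vocabulary.

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Mean-pool the final-layer vectors of the tokens matching `word`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, hidden_size)
    pieces = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = [i for i, tok in enumerate(pieces) if tok == word]
    return hidden[idx].mean(dim=0)

# 2) Context sensitivity: "bank" gets a different vector depending on the
#    surrounding sentence, whereas a static embedding would be identical.
v_river = word_vector("we sat on the bank of the river .", "bank")
v_money = word_vector("she deposited cash at the bank .", "bank")
print(torch.nn.functional.cosine_similarity(v_river, v_money, dim=0).item())
```

A similarity well below 1.0 in the last line illustrates that the two occurrences of "bank" are encoded differently; a static embedding table would return the same vector for both.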

Citation (APA)

Gomez-Perez, J. M., Denaux, R., & Garcia-Silva, A. (2020). Understanding Word Embeddings and Language Models. In A Practical Guide to Hybrid Natural Language Processing (pp. 17–31). Springer International Publishing. https://doi.org/10.1007/978-3-030-44830-1_3
