From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation


Abstract

Vector-based word representation paradigms situate lexical meaning at different levels of abstraction. Distributional and static embedding models generate a single vector per word type, which is an aggregate across the instances of the word in a corpus. Contextual language models, in contrast, directly capture the meaning of individual word instances. The goal of this survey is to provide an overview of word meaning representation methods, and of the strategies that have been proposed for improving the quality of the generated vectors. These often involve injecting external knowledge about lexical semantic relationships, or refining the vectors to describe different senses. The survey also covers recent approaches for obtaining word type-level representations from token-level ones, and for combining static and contextualized representations. Special focus is given to probing and interpretation studies aimed at discovering the lexical semantic knowledge that is encoded in contextualized representations. The challenges posed by this exploration have motivated interest in deriving static embeddings from contextualized ones, and in methods aimed at improving the similarity estimates that can be drawn from the space of contextual language models.
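As context for the type-level derivation the abstract mentions: a minimal sketch of one common strategy, mean-pooling a word's contextualized instance vectors across occurrences to obtain a single static, type-level embedding. This is an illustration, not the survey's own method; the model (bert-base-uncased), the Hugging Face transformers API usage, and the example sentences are all assumptions made here for concreteness.

```python
# Sketch: derive a static, type-level vector for a word by averaging its
# token-level contextualized vectors over several sentences (assumed setup).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def occurrence_vector(sentence: str, target: str):
    """Return one vector for the target word's occurrence in the sentence,
    mean-pooled over its subword pieces; None if the word is absent."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the target's subtoken span and pool its hidden states.
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i : i + len(target_ids)] == target_ids:
            return hidden[i : i + len(target_ids)].mean(dim=0)
    return None

sentences = [
    "The bank raised interest rates.",
    "The bank approved the loan.",
]
vecs = [v for s in sentences if (v := occurrence_vector(s, "bank")) is not None]
type_vector = torch.stack(vecs).mean(dim=0)  # static, type-level embedding
```

Averaging is only the simplest aggregation choice; clustering instance vectors instead would yield sense-specific rather than type-level representations, in line with the sense-refinement strategies the abstract describes.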

Cite (APA)

Apidianaki, M. (2023). From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation. Computational Linguistics, 49(2), 465–523. https://doi.org/10.1162/coli_a_00474
