Beyond context: A new perspective for word embeddings


Abstract

Most word embeddings today are trained by optimizing a language modeling goal of scoring words in their context, modeled as a multiclass classification problem. Despite the success of this approach, it is incomplete: in addition to its context, the orthographic or morphological aspects of a word can offer clues about its meaning. In this paper, we define a new modeling framework for training word embeddings that captures this intuition. Our framework is based on the well-studied problem of multi-label classification and, consequently, exposes several design choices: how to featurize words and contexts, which loss functions to train with, and how to normalize scores. Indeed, standard models such as CBOW and fastText are specific choices along each of these axes. We show via experiments that by combining feature engineering with embedding learning, our method can outperform CBOW using only 10% of the training data, on both standard word embedding evaluations and text classification experiments.
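To make the abstract's framing concrete, the sketch below recasts embedding training as multi-label classification: a word is featurized by its identity plus character n-grams (in the spirit of fastText's subword features), and each context word is a binary label scored with a sigmoid. This is an illustrative sketch, not the authors' implementation; the names and hyperparameters (`MultiLabelEmbeddings`, `word_features`, `DIM`, `NGRAM`, the learning rate) are assumptions made for exposition.

```python
# Illustrative sketch (not the paper's code): word embedding training
# recast as multi-label classification. A word is featurized by its
# identity plus character n-grams; each context word is a binary label.
# All names and hyperparameters below are assumptions for this sketch.
import numpy as np

DIM = 50    # embedding dimension (assumed)
NGRAM = 3   # character n-gram size (assumed)


def word_features(word):
    """One featurization choice: word identity + character n-grams."""
    padded = f"<{word}>"
    ngrams = [padded[i:i + NGRAM] for i in range(len(padded) - NGRAM + 1)]
    return [word] + ngrams


class MultiLabelEmbeddings:
    def __init__(self, dim=DIM, lr=0.05, seed=0):
        self.dim, self.lr = dim, lr
        self.rng = np.random.default_rng(seed)
        self.in_vecs = {}   # feature -> vector (word side)
        self.out_vecs = {}  # context word -> vector (label side)

    def _vec(self, table, key):
        if key not in table:
            table[key] = self.rng.normal(scale=0.1, size=self.dim)
        return table[key]

    def embed(self, word):
        feats = word_features(word)
        return sum(self._vec(self.in_vecs, f) for f in feats) / len(feats)

    def step(self, word, context, label):
        """One SGD step of binary logistic loss: is `context` a true label?"""
        feats = word_features(word)
        x = self.embed(word)
        c = self._vec(self.out_vecs, context)
        p = 1.0 / (1.0 + np.exp(-(x @ c)))  # per-label sigmoid score
        g = self.lr * (p - label)           # d(loss)/d(score)
        self.out_vecs[context] = c - g * x
        for f in feats:
            self.in_vecs[f] = self.in_vecs[f] - g * c / len(feats)


model = MultiLabelEmbeddings()
model.step("where", "are", 1.0)     # observed context word: positive label
model.step("where", "banana", 0.0)  # sampled negative label
print(model.embed("where")[:5])
```

Under this reading, the per-label sigmoid with sampled negatives is one point on the abstract's score-normalization axis; replacing it with a softmax over all context words would recover a standard multiclass objective such as CBOW's.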

Cite

APA

Zhou, Y., & Srikumar, V. (2019). Beyond context: A new perspective for word embeddings. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019) (pp. 22–32). Association for Computational Linguistics. https://doi.org/10.18653/v1/s19-1003
