Compositional demographic word embeddings

Abstract

Word embeddings are usually derived from corpora containing text from many individuals, which yields general-purpose representations rather than individually personalized ones. While personalized embeddings can improve performance on language modeling and other language processing tasks, they can only be computed for people with a large amount of longitudinal data, which new users lack. We propose a new form of personalized word embeddings that use demographic-specific word representations derived compositionally from full or partial demographic information for a user (i.e., gender, age, location, religion). We show that the resulting demographic-aware word representations outperform generic word representations on two tasks for English: language modeling and word associations. We further explore the trade-off between the number of available attributes and their relative effectiveness, and discuss the ethical implications of using them.
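
The abstract does not spell out how the composition is performed. As a minimal sketch only, assuming an additive scheme in which a generic word vector is adjusted by one offset per known demographic attribute, the idea of handling full or partial demographic information might look as follows. All names (`generic`, `offsets`, `compose`), the attribute values, and the random toy vectors are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical toy setup: vectors are random placeholders, not trained embeddings.
rng = np.random.default_rng(0)
VOCAB = ["home", "game", "church"]
DIM = 4

# Generic embeddings shared by all users.
generic = {w: rng.normal(size=DIM) for w in VOCAB}

# Assumed per-attribute offsets, keyed by (attribute, value), then by word.
offsets = {
    ("gender", "female"): {w: rng.normal(scale=0.1, size=DIM) for w in VOCAB},
    ("age", "under_30"): {w: rng.normal(scale=0.1, size=DIM) for w in VOCAB},
    ("location", "usa"): {w: rng.normal(scale=0.1, size=DIM) for w in VOCAB},
}


def compose(word, demographics):
    """Return the generic vector plus an offset for each known attribute value.

    Attributes that are missing from `demographics`, or whose value has no
    offset table, are skipped, so partial information still yields a vector.
    """
    vec = generic[word].copy()  # copy so the shared generic vector is not mutated
    for attr, value in demographics.items():
        table = offsets.get((attr, value))
        if table is not None:
            vec += table[word]
    return vec


# Full versus partial demographic information for the same word.
full = compose("home", {"gender": "female", "age": "under_30", "location": "usa"})
partial = compose("home", {"age": "under_30"})
unknown = compose("home", {})  # falls back to the generic embedding
```

Under this assumed scheme, a new user with no history still receives a usable vector from whatever attributes are known, and a user with no known attributes falls back to the generic embedding.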

Cite

Welch, C., Kummerfeld, J. K., Pérez-Rosas, V., & Mihalcea, R. (2020). Compositional demographic word embeddings. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 4076–4089). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.334
