Word mover's embedding: From word2vec to document embedding

Citations: 65
Mendeley readers: 324

Abstract

While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending it to generate unsupervised sentence or document embeddings. Recent work has demonstrated that the Word Mover's Distance (WMD), a distance measure between documents that aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the Word Mover's Embedding (WME), a novel approach to building an unsupervised document (sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on problems of short length.
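To make the construction in the abstract concrete, the sketch below illustrates the two ingredients it names: WMD as an optimal-transport cost between word-embedding clouds, and a WME feature vector built from kernel values against random documents. This is a minimal illustration, not the authors' reference implementation; it assumes the POT library (`pip install pot`) and a hypothetical `word_vecs` dict mapping tokens to pre-trained word vectors, and for simplicity it draws random documents from the vocabulary, whereas the paper samples random word vectors from the embedding space.

```python
# Minimal sketch of WMD and the WME feature map (illustrative only).
# Assumptions: POT library is installed; `word_vecs` maps str -> np.ndarray.
import numpy as np
import ot  # Python Optimal Transport


def wmd(doc_a, doc_b, word_vecs):
    """Word Mover's Distance between two token lists via exact EMD."""
    xa = np.array([word_vecs[w] for w in doc_a])
    xb = np.array([word_vecs[w] for w in doc_b])
    # Uniform word weights here; the paper uses normalized term frequencies.
    a = np.full(len(doc_a), 1.0 / len(doc_a))
    b = np.full(len(doc_b), 1.0 / len(doc_b))
    M = ot.dist(xa, xb, metric='euclidean')  # pairwise word distances
    return ot.emd2(a, b, M)                  # optimal transport cost


def wme(doc, random_docs, word_vecs, gamma=1.0):
    """WME feature map: exp(-gamma * WMD) against R random documents."""
    R = len(random_docs)
    feats = [np.exp(-gamma * wmd(doc, omega, word_vecs))
             for omega in random_docs]
    return np.array(feats) / np.sqrt(R)


# Usage sketch: sample short random documents, embed one sentence.
# (Vocabulary sampling and gamma=1.0 are illustrative choices.)
# vocab = list(word_vecs)
# rng = np.random.default_rng(0)
# random_docs = [list(rng.choice(vocab, size=rng.integers(1, 6)))
#                for _ in range(128)]
# emb = wme("the cat sat".split(), random_docs, word_vecs)
```

The point of the random-feature construction is that inner products between WME vectors approximate a Word Mover's Kernel, so expensive pairwise WMD computations in a KNN classifier can be replaced by a linear model over fixed-length embeddings.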

Citation (APA)

Wu, L., Yen, I. E. H., Xu, K., Xu, F., Balakrishnan, A., Chen, P. Y., … Witbrock, M. J. (2018). Word mover’s embedding: From word2vec to document embedding. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 (pp. 4524–4534). Association for Computational Linguistics. https://doi.org/10.18653/v1/d18-1482

Readers over time: yearly Mendeley reader counts, 2018–2025 (scale 0–120).

Readers' Seniority

PhD / Post grad / Masters / Doc: 120 (66%)
Researcher: 43 (23%)
Professor / Associate Prof.: 11 (6%)
Lecturer / Post doc: 9 (5%)

Readers' Discipline

Computer Science: 163 (82%)
Engineering: 21 (11%)
Business, Management and Accounting: 9 (5%)
Linguistics: 6 (3%)
