Robust Word Vectors: Context-Informed Embeddings for Noisy Texts

Valentin Malykh; Varvara Logacheva; Taras Khakhulin

Conference Proceedings (Open Access)

4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop (2018) 54-63

DOI: 10.18653/v1/w18-6108

Abstract

We suggest a new language-independent architecture for robust word vectors (RoVe). It is designed to alleviate the issue of typos, which are common in almost any user-generated content and hinder automatic text processing. Our model is morphologically motivated, which allows it to deal with unseen word forms in morphologically rich languages. We present results on a number of Natural Language Processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is typo-proof.

Citation (APA)

Malykh, V., Logacheva, V., & Khakhulin, T. (2018). Robust Word Vectors: Context-Informed Embeddings for Noisy Texts. In 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop (pp. 54–63). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-6108
