We propose a new language-independent architecture for robust word vectors (RoVe). It is designed to alleviate the problem of typos, which are common in almost all user-generated content and hinder automatic text processing. Our model is morphologically motivated, which allows it to handle unseen word forms in morphologically rich languages. We present results on a number of Natural Language Processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is typo-proof.
Malykh, V., Logacheva, V., & Khakhulin, T. (2018). Robust Word Vectors: Context-Informed Embeddings for Noisy Texts. In 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop (pp. 54–63). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-6108