Abstract
We propose a new language-independent architecture for robust word vectors (RoVe). It is designed to alleviate the effect of typos and misspellings, which are common in almost any user-generated content and hinder automatic text processing. The model is morphologically motivated, which allows it to handle unseen word forms in morphologically rich languages. We report results on a number of natural language processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is robust to typos.
Cite
Malykh, V., Khakhulin, T., & Logacheva, V. (2023). Robust Word Vectors: Context-Informed Embeddings for Noisy Texts. Journal of Mathematical Sciences (United States), 273(4), 614–627. https://doi.org/10.1007/s10958-023-06523-w