Skip-gram (word2vec) is a recent method for creating vector representations of words (“distributed word representations”) using a neural network. These representations have gained popularity across many areas of natural language processing, because they seem to capture syntactic and semantic information about words without any explicit supervision. We propose SubGram, a refinement of the Skip-gram model that also considers word structure during training, achieving large gains on the original Skip-gram test set.
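The core idea of incorporating word structure is to represent each word not only as an atomic token but also through its substrings. As a minimal illustration (not the paper's exact scheme — the substring lengths, boundary markers, and function name here are assumptions for the sketch), one might enumerate character n-grams of a word like this:

```python
def substrings(word, min_len=2, max_len=4):
    # Hypothetical sketch: wrap the word in boundary markers so that
    # prefixes and suffixes become distinct features.
    marked = "^" + word + "$"
    subs = {marked}  # keep the whole (marked) word as a feature too
    for n in range(min_len, max_len + 1):
        for i in range(len(marked) - n + 1):
            subs.add(marked[i:i + n])
    return subs

print(sorted(substrings("cats")))
```

A subword-aware model can then train embeddings over these substring features in addition to (or instead of) the whole-word token, so morphologically related words share parameters.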
CITATION STYLE
Kocmi, T., & Bojar, O. (2016). SubGram: Extending skip-gram word representation with substrings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9924 LNCS, pp. 182–189). Springer Verlag. https://doi.org/10.1007/978-3-319-45510-5_21