Word2vec’s Distributed Word Representation for Hindi Word Sense Disambiguation

Archana Kumari; D. K. Lobiyal

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11969 LNCS, pp. 325-335 (2020)

DOI: 10.1007/978-3-030-36987-3_21

10 Citations

13 Readers

Abstract

Word Sense Disambiguation (WSD) is the task of identifying the appropriate sense of an ambiguous word in a sentence. WSD is essential for language processing because many language-based applications depend on it to arrive at the closest interpretation of a text. In this paper, we exploit word embeddings to address WSD for Hindi text. The task involves two steps: creating the word embeddings, and using cosine similarity to identify the appropriate sense of the word. We consider the two most widely used word2vec architectures, Skip-Gram and Continuous Bag-of-Words [2], to build the embeddings. We then select the sense in closest proximity as the meaning of the ambiguous word. To evaluate the effectiveness of the proposed model, we performed experiments on large corpora and achieved an accuracy of nearly 52%.
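The second step of the abstract, choosing the sense in closest proximity by cosine similarity, might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how context or sense vectors are built, so averaging the vectors of context words and of sense-describing words is an assumption here. The three-dimensional vectors, the transliterated words, the sense labels, and the helper names (cosine, mean_vector, disambiguate) are all hypothetical; in practice the embeddings would come from a Skip-Gram or CBOW model trained on a Hindi corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings):
    """Average the embeddings of the words that are in the vocabulary."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def disambiguate(context_words, sense_words, embeddings):
    """Return the sense label whose vector is closest to the context vector.

    ASSUMPTION: each sense is represented by the mean embedding of a few
    words that describe it; the source abstract does not specify this.
    """
    context_vec = mean_vector(context_words, embeddings)
    if context_vec is None:
        return None
    best_sense, best_score = None, -2.0  # cosine is always >= -1
    for sense, words in sense_words.items():
        sense_vec = mean_vector(words, embeddings)
        if sense_vec is None:
            continue
        score = cosine(context_vec, sense_vec)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy embeddings (transliterated Hindi); real vectors would be
# learned by word2vec and have far more dimensions.
EMBEDDINGS = {
    "gehna": [0.9, 0.1, 0.0],   # jewellery
    "dhaatu": [0.8, 0.2, 0.1],  # metal
    "keemat": [0.7, 0.3, 0.0],  # price
    "neend": [0.1, 0.9, 0.1],   # sleep (noun)
    "raat": [0.0, 0.8, 0.3],    # night
    "bistar": [0.1, 0.7, 0.2],  # bed
}

# Two hypothetical senses of the ambiguous word "sona" (gold / to sleep).
SENSES = {
    "gold": ["dhaatu", "gehna"],
    "sleep": ["neend", "bistar"],
}

print(disambiguate(["keemat", "gehna"], SENSES, EMBEDDINGS))
print(disambiguate(["raat"], SENSES, EMBEDDINGS))
```

With these toy vectors, a context about price and jewellery lands nearer the "gold" sense, and a context about night lands nearer the "sleep" sense. Words outside the vocabulary are skipped, and the function returns None when no context word has an embedding.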

Citation (APA)

Kumari, A., & Lobiyal, D. K. (2020). Word2vec’s Distributed Word Representation for Hindi Word Sense Disambiguation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11969 LNCS, pp. 325–335). Springer. https://doi.org/10.1007/978-3-030-36987-3_21
