Introducing two Vietnamese datasets for evaluating semantic models of (Dis-)similarity and relatedness

Abstract

We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges. The two datasets are verified through standard co-occurrence and neural network models, showing results comparable to the respective English datasets.
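As an illustration of how a human-rated similarity dataset of this kind is commonly used, the sketch below scores a model by the rank correlation between its cosine similarities and the human ratings. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the choice of Spearman's rank correlation, the function names, the placeholder words (`w1` to `w4`), the toy vectors, and the ratings are all hypothetical and are not taken from ViSim-400.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def evaluate(pairs, vectors):
    """Correlate model cosine scores with human ratings for (word1, word2, rating) triples."""
    model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in pairs]
    human_scores = [rating for _, _, rating in pairs]
    return spearman(model_scores, human_scores)

# Hypothetical toy data: placeholder tokens and made-up ratings, NOT dataset entries.
toy_vectors = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
    "w4": [0.5, 0.5],
}
toy_pairs = [("w1", "w2", 5.5), ("w1", "w4", 3.0), ("w1", "w3", 0.5)]

rho = evaluate(toy_pairs, toy_vectors)
print(f"Spearman rho on toy data: {rho:.3f}")
```

In this toy setup the model ranks the three pairs in the same order as the made-up ratings, so the correlation comes out at (numerically close to) 1. A rank correlation is a natural fit here because human ratings and cosine scores live on different scales, and only their relative ordering is compared.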

Cite

Nguyen, K. A., Schulte im Walde, S., & Vu, N. T. (2018). Introducing two Vietnamese datasets for evaluating semantic models of (Dis-)similarity and relatedness. In NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference (Vol. 2, pp. 199–205). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n18-2032
