Neural-network-based semantic embedding models are relatively new but popular tools in the field of natural language processing. It has been shown that continuous embedding vectors assigned to words provide an adequate representation of their meaning for English. However, morphologically rich languages have not yet been the subject of experiments with these embedding models. In this paper, we investigate the performance of embedding models for Hungarian, trained on corpora with different levels of preprocessing. The models are evaluated on various lexical categorization tasks and are used to enrich the lexical database of a morphological analyzer with semantic features automatically extracted from the corpora.
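To illustrate the general idea behind embedding-based lexical categorization (not the paper's actual method or data), the following minimal sketch assigns a category to a word by comparing its embedding vector to those of labeled seed words via cosine similarity. The vectors, words, and seed labels here are hypothetical toy values; in practice the vectors would come from a model trained on large Hungarian corpora.

```python
from math import sqrt

# Toy embedding vectors (hypothetical values for illustration only);
# real vectors would be learned from preprocessed Hungarian corpora.
EMBEDDINGS = {
    "kutya": [0.9, 0.1, 0.0],   # "dog"
    "macska": [0.8, 0.2, 0.1],  # "cat"
    "fut": [0.1, 0.9, 0.2],     # "runs"
    "ugrik": [0.0, 0.8, 0.3],   # "jumps"
}

# Seed words with known categories (assumed labels for the sketch).
SEEDS = {"kutya": "noun", "fut": "verb"}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def categorize(word):
    """Assign the category of the most similar seed word."""
    vec = EMBEDDINGS[word]
    best_seed = max(SEEDS, key=lambda s: cosine(vec, EMBEDDINGS[s]))
    return SEEDS[best_seed]

print(categorize("macska"))  # → noun
print(categorize("ugrik"))   # → verb
```

A nearest-neighbor rule like this is only one of several possible ways to propagate lexical features from seeds through embedding space; clustering or classifier-based approaches are equally common.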
Siklósi, B. (2018). Using embedding models for lexical categorization in morphologically rich languages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9623 LNCS, pp. 115–126). Springer Verlag. https://doi.org/10.1007/978-3-319-75477-2_7