Utilizing vector models for automatic text lemmatization

Abstract

In this paper we tackle the problem of lemmatization for inflectional languages. We introduce a new algorithm which utilizes vector models of words. Current approaches in this area require knowing either the full grammar rules or a translation matrix between a word and its base form. However, this information is already encoded in natural text. Our solution uses text corpora to build vector models of words and a small amount of user input to infer lemmas. We have evaluated our approach on the Slovak language and present findings on its feasibility for real-world use.
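The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one way word vectors and a few user-supplied examples could be combined to infer a lemma. It assumes the familiar vector-offset (analogy) technique: average the offset from inflected form to lemma over a handful of reference pairs, add it to the query word's vector, and pick the nearest vocabulary word by cosine similarity. The function names (`cosine`, `infer_lemma`), the Slovak word list, and the three-dimensional vectors are all hypothetical, hand-made toy values, not vectors trained on a corpus and not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def infer_lemma(word, vectors, reference_pairs):
    """Guess the lemma of `word` via an averaged inflected-to-lemma offset.

    vectors: dict mapping word -> list of floats (the vector model).
    reference_pairs: list of (inflected, lemma) examples; this stands in
    for the "small amount of user input" mentioned in the abstract.
    """
    dim = len(vectors[word])
    n = len(reference_pairs)

    # Average offset that moves an inflected form toward its lemma.
    offset = [0.0] * dim
    for inflected, lemma in reference_pairs:
        for i in range(dim):
            offset[i] += (vectors[lemma][i] - vectors[inflected][i]) / n

    # Shift the query word by that offset to get the expected lemma position.
    target = [vectors[word][i] + offset[i] for i in range(dim)]

    # Return the most similar vocabulary word, excluding the query itself.
    candidates = (w for w in vectors if w != word)
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vector model (assumption): the third dimension marks the inflected
# form, the first two carry the word's "meaning".
VECTORS = {
    "mesto": [1.0, 0.0, 0.0],
    "mesta": [1.0, 0.0, 1.0],
    "auto": [0.0, 1.0, 0.0],
    "auta": [0.0, 1.0, 1.0],
    "okno": [1.0, 1.0, 0.0],
    "okna": [1.0, 1.0, 1.0],
}

# Reference pairs (inflected form, lemma) that a user might supply.
PAIRS = [("mesta", "mesto"), ("auta", "auto")]

result = infer_lemma("okna", VECTORS, PAIRS)
print(result)
```

With these toy vectors the shifted query lands exactly on the vector for "okno", so that word is returned. In a realistic setting the vectors would come from a model trained on a large text corpus, and the nearest neighbour would be an approximation rather than an exact match.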

Citation (APA)

Gallay, L., & Šimko, M. (2016). Utilizing vector models for automatic text lemmatization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9587, pp. 532–543). Springer Verlag. https://doi.org/10.1007/978-3-662-49192-8_43
