Is it possible to create a very large wordnet in 100 days? An evaluation

19Citations
Citations of this article
21Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Wordnets are large-scale lexical databases of related words and concepts, useful for language-aware software applications. They have recently been built for many languages by using various approaches. The Finnish wordnet, FinnWordNet (FiWN), was created by translating the more than 200,000 word senses in the English Princeton WordNet (PWN) 3.0 in 100 days. To ensure quality, they were translated by professional translators. The direct translation approach was based on the assumption that most synsets in PWN represent language-independent real-world concepts. Thus also the semantic relations between synsets were assumed mostly language-independent, so the structure of PWN could be reused as well. This approach allowed the creation of an extensive Finnish wordnet directly aligned with PWN and also provided us with a translation relation and thus a bilingual wordnet usable as a dictionary. In this paper, we address several concerns raised with regard to our approach, many of them for the first time. We evaluate the craftsmanship of the translators by checking the spelling and translation quality, the viability of the approach by assessing the synonym quality both on the lexeme and concept level, as well as the usefulness of the resulting lexical resource both for humans and in a language-technological task. We discovered no new problems compared with those already known in PWN. As a whole, the paper contributes to the scientific discourse on what it takes to create a very large wordnet. As a side-effect of the evaluation, we extended FiWN to contain 208,645 word senses in 120,449 synsets, effectively making version 2.0 of FiWN currently the largest wordnet in the world by these statistics. © 2013 Springer Science+Business Media Dordrecht.

Cite

CITATION STYLE

APA

Lindén, K., & Niemi, J. (2014). Is it possible to create a very large wordnet in 100 days? An evaluation. Language Resources and Evaluation, 48(2), 191–201. https://doi.org/10.1007/s10579-013-9245-0

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free