Guessers for finite-state transducer lexicons

Krister Lindén

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5449 LNCS 158-169

DOI: 10.1007/978-3-642-00382-0_13

Abstract

Language software applications encounter new words, e.g., acronyms, technical terminology, names or compounds of such words. In order to add new words to a lexicon, we need to indicate their inflectional paradigm. We present a new generally applicable method for creating an entry generator, i.e., a paradigm guesser, for finite-state transducer lexicons. As a guesser tends to produce numerous suggestions, it is important that the correct suggestions be among the first few candidates. We prove some formal properties of the method and evaluate it on Finnish, English and Swedish full-scale transducer lexicons. We use the open-source Helsinki Finite-State Technology [1] to create finite-state transducer lexicons from existing lexical resources and automatically derive guessers for unknown words. The method has a recall of 82-87% and a precision of 71-76% for the three test languages. The model needs no external corpus and can therefore serve as a baseline. © Springer-Verlag Berlin Heidelberg 2009.

Cite

Lindén, K. (2009). Guessers for finite-state transducer lexicons. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 158–169). https://doi.org/10.1007/978-3-642-00382-0_13
