TAGH: A complete morphology for German based on weighted finite state automata

Abstract

TAGH is a system for the automatic recognition of German word forms. It combines a stem lexicon that includes allomorphs with a concatenative mechanism for inflection and word formation. Weighted finite-state automata (FSA) and a cost function determine the correct segmentation of complex forms: for a given compound, the segmentation with the lowest cost is taken to be the correct one. The stem lexicon contains almost 80,000 stems and was compiled over five years from large newspaper corpora and literary texts. More than 1,000 rules for derivational and compositional word formation considerably increase the number of analyzable word forms. The recognition rate of TAGH is more than 99% for modern newspaper text and approximately 98.5% for literary texts.
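The least-cost idea can be illustrated with a minimal sketch. It is not TAGH's implementation: it replaces the weighted automata with a plain dynamic program over a toy lexicon, and every stem, cost, and name in it (`STEM_COST`, `BOUNDARY_COST`, `best_segmentation`) is an assumption made up for illustration. It also ignores capitalization, allomorphs, linking elements, and inflection.

```python
from math import inf

# Hypothetical toy lexicon: stem -> cost (lower means more plausible).
# These values are invented for illustration, not taken from TAGH.
STEM_COST = {
    "stau": 1.0,    # "congestion"
    "staub": 1.0,   # "dust"
    "becken": 1.0,  # "basin"
    "ecken": 2.5,   # "corners"
}
BOUNDARY_COST = 1.0  # hypothetical penalty for each compound boundary


def best_segmentation(word):
    """Return (cost, segments) for the cheapest segmentation of `word`.

    Returns (inf, []) when no sequence of lexicon stems covers the word.
    """
    n = len(word)
    # best[i] holds the cheapest analysis of the prefix word[:i].
    best = [(inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            prev_cost, prev_segs = best[start]
            stem = word[start:end]
            if prev_cost == inf or stem not in STEM_COST:
                continue
            # Charge a boundary penalty for every stem after the first.
            boundary = BOUNDARY_COST if prev_segs else 0.0
            cost = prev_cost + STEM_COST[stem] + boundary
            if cost < best[end][0]:
                best[end] = (cost, prev_segs + [stem])
    return best[n]


if __name__ == "__main__":
    print(best_segmentation("staubecken"))
```

With these invented costs, "staubecken" has two candidate analyses, stau+becken and staub+ecken, and the sketch returns the cheaper one. In a weighted-automaton setting the same choice would typically correspond to picking the lowest-weight path through the analysis automaton.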

Cite

Geyken, A., & Hanneforth, T. (2006). TAGH: A complete morphology for German based on weighted finite state automata. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4002 LNAI, pp. 55–66). Springer Verlag. https://doi.org/10.1007/11780885_7
