Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods

Abstract

We present a survey of tagging accuracies - concerning part-of-speech and full morphological tagging - for several taggers based on a corpus for medieval church Latin (see www.comphistsem.org). The best tagger in our sample, Lapos, has a PoS tagging accuracy of close to 96% and an overall tagging accuracy (including full morphological tagging) of about 85%. When we 'intersect' the taggers with our lexicon, the latter score increases to almost 91% for Lapos. A conservative assessment of lemmatization accuracy on our data estimates a score of 93-94% for a lexicon-based lemmatization strategy and a score of 94-95% for lemmatizing via trained lemmatizers. © 2015 Association for Computational Linguistics and The Asian Federation of Natural Language Processing.

Cite

Eger, S., Vor Der Brück, T., & Mehler, A. (2015). Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2015-text, pp. 105–113). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3716
