We present a survey of tagging accuracies, covering both part-of-speech (PoS) and full morphological tagging, for several taggers based on a corpus of medieval church Latin (see www.comphistsem.org). The best tagger in our sample, Lapos, has a PoS tagging accuracy of close to 96% and an overall tagging accuracy (including full morphological tagging) of about 85%. When we 'intersect' the taggers with our lexicon, the latter score increases to almost 91% for Lapos. A conservative assessment of lemmatization accuracy on our data estimates a score of 93-94% for a lexicon-based lemmatization strategy and a score of 94-95% for lemmatizing via trained lemmatizers. © 2015 Association for Computational Linguistics and The Asian Federation of Natural Language Processing.
CITATION STYLE
Eger, S., Vor Der Brück, T., & Mehler, A. (2015). Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2015-text, pp. 105–113). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3716