Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods

Abstract

We present a survey of tagging accuracies, covering both part-of-speech (PoS) and full morphological tagging, for several taggers based on a corpus for medieval church Latin (see www.comphistsem.org). The best tagger in our sample, Lapos, has a PoS tagging accuracy of close to 96% and an overall tagging accuracy (including full morphological tagging) of about 85%. When we 'intersect' the taggers with our lexicon, the latter score increases to almost 91% for Lapos. A conservative assessment of lemmatization accuracy on our data estimates a score of 93-94% for a lexicon-based lemmatization strategy and a score of 94-95% for lemmatizing via trained lemmatizers. © 2015 Association for Computational Linguistics and The Asian Federation of Natural Language Processing.
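The abstract does not spell out how a tagger is 'intersected' with the lexicon. One plausible reading, offered here only as a minimal sketch, is to keep the tagger's highest-ranked tag among those the lexicon licenses for the word form, and to fall back to the tagger's top tag otherwise. The toy lexicon, the tag strings, and the function name `intersect_with_lexicon` are all hypothetical and are not taken from the paper or from Lapos.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon: word form -> set of full morphological tags it licenses.
LEXICON: Dict[str, Set[str]] = {
    "rosa": {"NOUN|nom|sg", "NOUN|abl|sg", "NOUN|voc|sg"},
    "amat": {"VERB|3|sg|pres|ind|act"},
}


def intersect_with_lexicon(
    word: str,
    ranked_tags: List[Tuple[str, float]],
    lexicon: Dict[str, Set[str]],
) -> str:
    """Return the best-ranked tagger candidate that the lexicon allows.

    `ranked_tags` is assumed to be sorted from most to least probable.
    If the word is not in the lexicon, or none of the candidates is
    licensed, the tagger's top candidate is returned unchanged.
    """
    if not ranked_tags:
        raise ValueError("ranked_tags must contain at least one candidate")
    allowed = lexicon.get(word.lower())
    if allowed:
        for tag, _score in ranked_tags:
            if tag in allowed:
                return tag
    # Assumption: trust the tagger when the lexicon cannot help.
    return ranked_tags[0][0]


# Usage: the tagger's top guess is not licensed, so the second one is chosen.
candidates = [("NOUN|acc|sg", 0.5), ("NOUN|abl|sg", 0.3), ("NOUN|nom|sg", 0.2)]
chosen = intersect_with_lexicon("rosa", candidates, LEXICON)
```

In this reading the lexicon acts purely as a filter over the tagger's ranked output, so it can only change a prediction when the word form is known and at least one candidate is licensed.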

Citation (APA)

Eger, S., Vor Der Brück, T., & Mehler, A. (2015). Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2015-text, pp. 105–113). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3716
