Abstract
This paper presents a general-purpose NLP pipeline for Ancient or early forms of Greek (Classical, Koine, and Medieval) that achieves a slight improvement over the state of the art by training jointly on several Universal Dependencies treebanks. We measure the model's performance against other comparable tools. We show that the selected Greek language models tend not to generalize well to samples outside their training sets. More work is necessary to ensure interoperability between the existing datasets. We identify the main issues and suggest improvements.
Cite
Kostkan, J., Kardos, M., Mortensen, J. P. B., & Nielbo, K. L. (2023). OdyCy - A general-purpose NLP pipeline for Ancient Greek. In EACL 2023 - 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings of LaTeCH-CLfL 2023 (pp. 128–134). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.latechclfl-1.14