Abstract
In this work, we present a Serbian literary corpus that is being developed under the umbrella of the “Distant Reading for European Literary History” COST Action CA16204. Using this corpus of novels written more than a century ago, we have developed and made publicly available a Named Entity Recognizer (NER) with a Convolutional Neural Network (CNN) architecture, trained to recognize 7 different named entity types and achieving an F1 score of approximately 91% on the test dataset. This model has been further assessed on a separate evaluation dataset. We conclude with a comparison of the developed model with the existing one, followed by a discussion of the pros and cons of both models.
Cite
Todorović, B. Š., Krstev, C., Stanković, R., & Nešić, M. I. (2021). Serbian NER&Beyond: The Archaic and the Modern Intertwinned. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 1252–1260). Incoma Ltd. https://doi.org/10.26615/978-954-452-072-4_141