Danish is a major Scandinavian language spoken daily by around six million people. However, it lacks a unified, open set of NLP tools. This demonstration will introduce DKIE, an extensible open-source toolkit for processing Danish text. We implement an information extraction architecture for Danish within GATE, including integrated third-party tools. This implementation includes the creation of a substantial set of corpus annotations for data-intensive named entity recognition. The final application and dataset are made openly available, and the part-of-speech tagger and NER model also operate independently or with the Stanford NLP toolkit.
CITATION STYLE
Derczynski, L., Field, C. V., & Bøgh, K. S. (2014). DKIE: Open Source Information Extraction for Danish. In EACL 2014 - Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics (pp. 61–64). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/e14-2016