Czech named entity corpus and SVM-based recognizer

Abstract

This paper deals with the recognition of named entities in Czech texts. We present a recently released corpus of Czech sentences with manually annotated named entities, in which a rich two-level classification scheme was used. The corpus contains around 6,000 sentences with roughly 33,000 marked named entity instances. We use the data to train and evaluate a named entity recognizer based on the Support Vector Machine classification technique. The presented recognizer outperforms the results previously reported for NE recognition in Czech.

Cite

Kravalová, J., & Žabokrtský, Z. (2009). Czech named entity corpus and SVM-based recognizer. In NEWS 2009 - 2009 Named Entities Workshop: Shared Task on Transliteration at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 (pp. 194–201). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1699705.1699748
