Czech named entity corpus and SVM-based recognizer

Jana Kravalová; Zdeněk Žabokrtský

Conference Proceedings, Open Access

NEWS 2009 - 2009 Named Entities Workshop: Shared Task on Transliteration at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 (2009) 194-201

DOI: 10.3115/1699705.1699748

Abstract

This paper deals with recognition of named entities in Czech texts. We present a recently released corpus of Czech sentences with manually annotated named entities, in which a rich two-level classification scheme was used. There are around 6000 sentences in the corpus with roughly 33000 marked named entity instances. We use the data for training and evaluating a named entity recognizer based on the Support Vector Machine (SVM) classification technique. The presented recognizer outperforms the results previously reported for NE recognition in Czech.

Citation (APA)

Kravalová, J., & Žabokrtský, Z. (2009). Czech named entity corpus and SVM-based recognizer. In NEWS 2009 - 2009 Named Entities Workshop: Shared Task on Transliteration at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 (pp. 194–201). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1699705.1699748
