Evaluating corpora for named entity recognition using character-level features

Casey Whitelaw; Jon Patrick

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2003) 2903 910-921

DOI: 10.1007/978-3-540-24581-0_78

Abstract

We present a new collection of training corpora for evaluation of language-independent named entity recognition systems. For the five languages included in this initial release, Basque, Dutch, English, Korean, and Spanish, we provide an analysis of the relative difficulty of the NER task for both the language in general, and as a supervised task using these corpora. We construct three strongly language-independent systems, each using only orthographic features, and compare their performance on both seen and unseen data. We achieve improved results through combining these classifiers, showing that ensemble approaches are suitable when dealing with language-independent problems.

Cite

Whitelaw, C., & Patrick, J. (2003). Evaluating corpora for named entity recognition using character-level features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2903, pp. 910–921). Springer Verlag. https://doi.org/10.1007/978-3-540-24581-0_78
