Named entity recognition in Wikipedia

Dominic Balasuriya; Nicky Ringland; Joel Nothman; Tara Murphy; James R. Curran

Conference Proceedings (Open Access)

People's Web 2009 - 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 - Proceedings (2009) 10-18

DOI: 10.3115/1699765.1699767

99 Citations

180 Readers

Abstract

Named entity recognition (NER) is used in many domains beyond the newswire text that comprises current gold-standard corpora. Recent work has used Wikipedia's link structure to automatically generate near gold-standard annotations. Until now, these resources have only been evaluated on newswire corpora or themselves. We present the first NER evaluation on a Wikipedia gold standard (WG) corpus. Our analysis of cross-corpus performance on WG shows that Wikipedia text may be a harder NER domain than newswire. We find that an automatic annotation of Wikipedia has high agreement with WG and, when used as training data, outperforms newswire models by up to 7.7%.

Citation (APA)

Balasuriya, D., Ringland, N., Nothman, J., Murphy, T., & Curran, J. R. (2009). Named entity recognition in Wikipedia. In People’s Web 2009 - 2009 Workshop on The People’s Web Meets NLP: Collaboratively Constructed Semantic Resources at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 - Proceedings (pp. 10–18). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1699765.1699767
