An approach for named entity recognition in poorly structured data

Nuno Freire; José Borbinha; Pável Calado

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7295 LNCS 718-732

DOI: 10.1007/978-3-642-30284-8_55

Abstract

This paper describes an approach to named entity recognition in structured data whose elements contain free text as values. We studied the recognition of person, location and organization entities in bibliographic data sets from a concrete, wide-ranging digital library initiative. Our approach is based on conditional random field models, using features designed to perform named entity recognition in the absence of strong lexical evidence and exploiting the semantic context given by the data structure. The evaluation results indicate that, with these specialized features, named entity recognition can be performed on free text within structured data with acceptable accuracy. Our approach achieved a maximum precision of 0.91 at 0.55 recall and a maximum recall of 0.82 at 0.77 precision. These results were consistently higher than those obtained with the Stanford Named Entity Recognizer, which was developed for grammatically well-formed text. We believe this level of quality allows the approach to support a wide range of information extraction applications in structured data. © 2012 Springer-Verlag.
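
The abstract mentions features that combine weak lexical cues with the semantic context of the enclosing data element. As a minimal sketch of what such per-token features might look like, the Python below builds one feature dictionary per token and attaches the element name to each. The function name `sequence_features`, the feature names, the tokenizer and the example field `publisher` are all hypothetical illustrations, not the feature set used in the paper.

```python
import re
from typing import Dict, List


def sequence_features(field: str, text: str) -> List[Dict[str, object]]:
    """Return one feature dict per token of `text`.

    `field` is the name of the data element holding the text
    (hypothetical example of "semantic context from the structure").
    """
    # Simple tokenizer (an assumption): words and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    features = []
    for i, tok in enumerate(tokens):
        features.append({
            # Weak lexical / orthographic evidence.
            "lower": tok.lower(),
            "is_capitalized": tok[:1].isupper(),
            "is_all_caps": tok.isupper(),
            "has_digit": any(c.isdigit() for c in tok),
            # Neighbouring tokens, with boundary markers.
            "prev_lower": tokens[i - 1].lower() if i > 0 else "<START>",
            "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
            # Structural context: which element the token came from.
            "field": field,
        })
    return features


feats = sequence_features("publisher", "Springer-Verlag, Berlin 2012")
for f in feats:
    print(f["lower"], f["is_capitalized"], f["field"])
```

Feature dictionaries of this shape are a common input format for sequence-labelling toolkits, so a conditional random field model could in principle be trained on them; the choice of toolkit and labelling scheme is left open here.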

Citation (APA)

Freire, N., Borbinha, J., & Calado, P. (2012). An approach for named entity recognition in poorly structured data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7295 LNCS, pp. 718–732). https://doi.org/10.1007/978-3-642-30284-8_55