A method for building a labeled named entity recognition corpus using ontologies

Ngoc Trinh Vu; Van Hien Tran; Thi Huyen Trang Doan; Hoang Quynh Le; Mai Vu Tran

Conference Proceedings

Advances in Intelligent Systems and Computing (2015) 358 141-149

DOI: 10.1007/978-3-319-17996-4_13

Abstract

Building a labeled corpus with sufficient data and good coverage, while limiting cost, effort, and time, is a popular research topic in natural language processing. Constructing training data automatically or semi-automatically has become a concern of the research community. For this reason, we consider the problem of building a corpus for phenotype entity recognition, with class-specific feature detectors from unlabeled data, based on over 10,260 unique terms (more than 15,000 synonyms) describing human phenotypic features in the Human Phenotype Ontology (HPO) and about 9,000 unique terms (about 24,000 synonyms) of mouse abnormal phenotype descriptions in the Mammalian Phenotype Ontology. The corpus was evaluated on three corpora (the Khordad corpus, Phenominer 2012, and Phenominer 2013) using the Maximum Entropy and Beam Search methods. The resulting F-scores are 31.71% and 35.77% for the Phenominer 2012 and Phenominer 2013 corpora, respectively, and 78.36% for the Khordad corpus.
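The abstract does not spell out how ontology terms are turned into labels, so the following is only a minimal sketch of one common approach, assuming plain longest-match dictionary lookup over unlabeled text with BIO tags. It is not the authors' pipeline. The function names, the `PHENOTYPE` tag, and the two-item term list are hypothetical stand-ins for the HPO and Mammalian Phenotype Ontology vocabularies.

```python
import re


def tokenize(text):
    # Deliberately simple tokenizer (assumption): lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_term_index(terms):
    # Store every term (or synonym) as a tuple of tokens for exact matching.
    index = {tuple(tokenize(t)) for t in terms if tokenize(t)}
    max_len = max((len(t) for t in index), default=0)
    return index, max_len


def label_bio(sentence, index, max_len):
    # Greedy longest-match tagging: B- marks the first token of a matched
    # term, I- the remaining tokens, O everything else.
    tokens = tokenize(sentence)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in index:
                matched = n
                break
        if matched:
            tags[i] = "B-PHENOTYPE"
            for j in range(i + 1, i + matched):
                tags[j] = "I-PHENOTYPE"
            i += matched
        else:
            i += 1
    return list(zip(tokens, tags))


# Hypothetical term list standing in for ontology terms and synonyms.
index, max_len = build_term_index(["short stature", "seizures"])
labeled = label_bio("The patient had short stature and seizures.", index, max_len)
```

Sentences labeled this way could then serve as training data for a sequence tagger. Exact matching misses spelling variants and inflected forms, so a real pipeline would likely need normalization beyond this sketch.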

Cite (APA)

Vu, N. T., Tran, V. H., Doan, T. H. T., Le, H. Q., & Tran, M. V. (2015). A method for building a labeled named entity recognition corpus using ontologies. In Advances in Intelligent Systems and Computing (Vol. 358, pp. 141–149). Springer Verlag. https://doi.org/10.1007/978-3-319-17996-4_13
