Various features with integrated strategies for protein name classification

Budi Taruna Ongkowijaya; Shilin Ding; Xiaoyan Zhu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3759 LNCS 213-222

DOI: 10.1007/11576259_24

Abstract

Classification is an integral part of a named entity recognition system: it assigns each recognized named entity to its corresponding class. This task has received little attention in the biomedical domain, owing to a lack of awareness in previous studies of the need to differentiate feature sources and strategies. In this research, we analyze different sources and strategies for protein name classification and develop integrated strategies that combine the advantages of rule-based, dictionary-based, and statistical methods. The rule-based method uses terms and knowledge of protein nomenclature that provide strong cues for protein names. The dictionary-based method uses a set of rules for curating a protein name dictionary. These terms and dictionaries are combined with our own features in a statistical classifier. Our features comprise word shape features and unigram and bigram features. With these information sources and integrated strategies, we achieve state-of-the-art performance in classifying protein and non-protein names. © Springer-Verlag Berlin Heidelberg 2005.
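
To make the feature types named in the abstract concrete, here is a minimal sketch of word shape, unigram, and bigram features for a candidate name. The abstract does not define these features precisely, so the shape alphabet (`A`, `a`, `0`), the run-collapsing rule, and the function names `word_shape` and `extract_features` are illustrative assumptions, not the authors' implementation.

```python
def word_shape(token: str) -> str:
    """Map each character to a coarse class and collapse repeated classes.

    Assumed scheme (not taken from the paper): uppercase -> 'A',
    lowercase -> 'a', digit -> '0', anything else is kept as-is.
    """
    shape = []
    for ch in token:
        if ch.isupper():
            cls = "A"
        elif ch.islower():
            cls = "a"
        elif ch.isdigit():
            cls = "0"
        else:
            cls = ch
        # Collapse runs so that 'IL' and 'ILK' share the shape 'A'.
        if not shape or shape[-1] != cls:
            shape.append(cls)
    return "".join(shape)


def extract_features(name: str) -> dict:
    """Build a sparse binary feature dict for a whitespace-tokenized name."""
    tokens = name.split()
    feats = {}
    for tok in tokens:
        feats["shape=" + word_shape(tok)] = 1   # word shape feature
        feats["uni=" + tok.lower()] = 1         # unigram feature
    for left, right in zip(tokens, tokens[1:]):
        feats["bi=" + left.lower() + "_" + right.lower()] = 1  # bigram feature
    return feats


features = extract_features("IL-2 receptor alpha")
```

A dictionary of this kind could be passed to any statistical classifier that accepts sparse binary features; the choice of classifier is left open here because the abstract does not name one.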

Cite

Ongkowijaya, B. T., Ding, S., & Zhu, X. (2005). Various features with integrated strategies for protein name classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3759 LNCS, pp. 213–222). https://doi.org/10.1007/11576259_24
