Abstract
Motivation: Disease named entities play a central role in many areas of biomedical research, and automatic recognition and normalization of such entities have received increasing attention in biomedical research communities. Existing methods typically use pipeline models with two independent phases: (i) a disease named entity recognition (DER) system finds the boundaries of mentions in text and (ii) a disease named entity normalization (DEN) system links the recognized mentions to concepts in a controlled vocabulary. Such models have two main problems: (i) errors propagate from DER to DEN and (ii) DEN is useful for DER, but pipeline models cannot exploit this.
Methods: We propose a transition-based model that jointly performs disease named entity recognition and normalization. It casts output construction as an incremental state transition process and globally learns sequences of transition actions, which correspond to joint structural outputs. Beam search and online structured learning are used, with learning designed to guide search. Compared with the only existing method for joint DEN and DER, our method allows nonlocal features to be used, which significantly improves accuracy.
Results: We evaluate our model on two corpora: the BioCreative V Chemical Disease Relation (CDR) corpus and the NCBI disease corpus. Experiments show that our joint framework achieves significantly higher performance than competitive pipeline baselines. Our method compares favourably to other state-of-the-art approaches.
Cite
Lou, Y., Zhang, Y., Qian, T., Li, F., Xiong, S., & Ji, D. (2017). A transition-based joint model for disease named entity recognition and normalization. Bioinformatics, 33(15), 2363–2371. https://doi.org/10.1093/bioinformatics/btx172