Previous studies on clinical sequence labeling require large amounts of task-specific knowledge in the form of handcrafted features. Using the latest developments in representation learning, this paper introduces BERT embeddings as a character-based pretrained model and incorporates them into three competing deep learning models (CNN-LSTM, Bi-LSTM, and Bi-LSTM-CRF) to extract clinical entities from electronic health records. A comparative evaluation on the CCKS-2017 Task 2 benchmark dataset reveals that: (1) BERT embeddings not only improve the performance of clinical NER tasks but also serve as a good candidate for building end-to-end NER models that require no feature engineering on Chinese EHRs; (2) Bi-LSTM-CRF achieves the highest performance, a 93% F1 score, when it uses BERT embeddings. This paper may enhance our understanding of how to use BERT embeddings in clinical NER research.
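To make the Bi-LSTM-CRF pipeline concrete, the sketch below shows the Viterbi decoding step that a CRF layer performs on top of per-token emission scores (such as those a Bi-LSTM produces over BERT embeddings). This is a minimal, illustrative NumPy implementation, not the authors' code; the tag set and scores are invented for the example.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   (seq_len, n_tags) per-token tag scores
                 (e.g. from a Bi-LSTM over character-level BERT embeddings)
    transitions: (n_tags, n_tags) score of moving from tag i to tag j
    Returns (best tag path, its total score).
    """
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()  # best score ending in each tag at step 0
    backpointers = []
    for t in range(1, seq_len):
        # broadcast: previous score + transition score + current emission
        total = score[:, None] + transitions + emissions[t][None, :]
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0)
    # backtrack from the best final tag
    best_tag = int(score.argmax())
    path = [best_tag]
    for bp in reversed(backpointers):
        best_tag = int(bp[best_tag])
        path.append(best_tag)
    return list(reversed(path)), float(score.max())

# Toy BIO tag set: O=0, B=1, I=2; forbid the invalid transition O -> I.
transitions = np.zeros((3, 3))
transitions[0, 2] = -10.0
emissions = np.array([[0.0, 2.0, 0.0],   # token 1 favors B
                      [0.0, 0.0, 2.0],   # token 2 favors I
                      [2.0, 0.0, 0.0]])  # token 3 favors O
path, total = viterbi_decode(emissions, transitions)
# path == [1, 2, 0], i.e. B, I, O
```

The transition matrix is what lets the CRF layer enforce label consistency (e.g. an I tag must follow a B or I tag), which is the usual explanation for Bi-LSTM-CRF outperforming a plain Bi-LSTM on NER.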
CITATION STYLE
Wu, J., Shao, D. rui, Guo, J. hang, Cheng, Y., & Huang, G. (2019). Character-Based Deep Learning Approaches for Clinical Named Entity Recognition: A Comparative Study Using Chinese EHR Texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11924 LNCS, pp. 311–322). Springer. https://doi.org/10.1007/978-3-030-34482-5_28