Incorporating Lexicon for Named Entity Recognition of Traditional Chinese Medicine Books

Abstract

Little research has been done on Named Entity Recognition (NER) for Traditional Chinese Medicine (TCM) books, and most existing work uses statistical models such as Conditional Random Fields (CRFs). However, these methods do not fully exploit lexicon information or large-scale unlabeled corpus data. To improve NER performance on TCM books, we propose a method based on the biLSTM-CRF model that incorporates lexicon information into the representation layer to enrich its semantic information. We compared our approach with several previous character-based and word-based methods. Experiments on the “Shanghan Lun” dataset show that our method outperforms previous models. In addition, because no large corpus was previously available in this field, we collected 376 TCM books to construct a large-scale corpus and obtain pre-trained vectors. We have released the corpus and pre-trained vectors to the public.
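
The abstract does not say how lexicon information enters the representation layer, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes each character receives a 4-dimensional indicator vector recording whether it begins (B), lies inside (M), ends (E), or alone forms (S) some lexicon word matched in the sentence; such a vector could then be concatenated with the character embedding before a biLSTM-CRF. The function name `lexicon_features`, the `max_len` parameter, and the example lexicon are hypothetical.

```python
from typing import List, Set


def lexicon_features(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[List[int]]:
    """Return one [B, M, E, S] indicator vector per character.

    Assumption (not from the paper): a character's vector marks every role
    it plays in any lexicon word that matches a substring of the sentence.
    """
    feats = [[0, 0, 0, 0] for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every candidate substring of up to max_len characters.
        for end in range(start + 1, min(n, start + max_len) + 1):
            if sentence[start:end] not in lexicon:
                continue
            if end - start == 1:
                feats[start][3] = 1          # S: single-character word
            else:
                feats[start][0] = 1          # B: first character of the word
                for mid in range(start + 1, end - 1):
                    feats[mid][1] = 1        # M: interior character
                feats[end - 1][2] = 1        # E: last character of the word
    return feats


# Illustrative lexicon and sentence, chosen only for this example.
example_lexicon = {"桂枝", "桂枝汤", "汤"}
example_features = lexicon_features("服桂枝汤", example_lexicon)
```

In this example the character 枝 is marked both M (inside 桂枝汤) and E (end of 桂枝), which shows how overlapping lexicon matches can be kept side by side instead of forcing a single segmentation.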

Cite

Song, B., Bao, Z., Wang, Y. Z., Zhang, W., & Sun, C. (2020). Incorporating Lexicon for Named Entity Recognition of Traditional Chinese Medicine Books. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12431 LNAI, pp. 481–489). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60457-8_39
