BERT-Based Named Entity Recognition in Chinese Twenty-Four Histories


Abstract

Named entity recognition in classical Chinese plays a fundamental role in improving information extraction and in constructing knowledge graphs from classical Chinese texts. However, due to the lack of annotated data and the complexity of its grammatical rules, named entity recognition in classical Chinese has made little progress. To address the shortage of labeled data, we propose an end-to-end solution that does not rely on domain knowledge; instead, it builds on the pre-trained BERT-Chinese model and integrates a BiLSTM-CRF model for classical Chinese named entity recognition. The BERT-Chinese model converts the input text into character-level embedding vectors, the BiLSTM model is then used for further training, and finally the CRF layer normalizes the BiLSTM output to obtain a globally optimal labeling sequence. We conducted fine-tuning on ChineseDailyNerCorpus. By designing an optimized fine-tuning method, we realized the named entity recognition task on the Chinese Twenty-Four Histories. We evaluated our model on the Chinese Twenty-Four Histories data and achieved an average F1 score of about 75%. The experimental results also show that the BERT model has a strong transfer learning ability.
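To illustrate the CRF step described above — choosing a globally optimal labeling sequence rather than labeling each character independently — here is a minimal sketch of Viterbi decoding over per-character emission scores and tag-transition scores. The tag set, scores, and transition weights are illustrative assumptions, not values from the paper; a real CRF layer learns the transitions during training.

```python
# Hedged sketch of the Viterbi decoding a CRF layer performs at inference
# time: given per-character tag scores (emissions) and pairwise transition
# scores, find the single best-scoring tag sequence overall.

def viterbi_decode(emissions, transitions, tags):
    """emissions: list of {tag: score} dicts, one per character.
    transitions: {(prev_tag, tag): score} for every tag pair.
    Returns the highest-scoring tag sequence."""
    # best[t] = (score of best path ending in tag t, that path)
    best = {t: (emissions[0][t], [t]) for t in tags}
    for em in emissions[1:]:
        new_best = {}
        for t in tags:
            # Pick the previous tag that maximizes path score + transition.
            prev, (score, path) = max(
                ((p, best[p]) for p in tags),
                key=lambda x: x[1][0] + transitions[(x[0], t)],
            )
            new_best[t] = (score + transitions[(prev, t)] + em[t], path + [t])
        best = new_best
    return max(best.values(), key=lambda x: x[0])[1]


# Toy example with a BIO-style tag set (hypothetical scores): a strong
# penalty on O -> I-PER rules out an I-PER tag that no entity started.
tags = ["B-PER", "I-PER", "O"]
trans = {(a, b): 0.0 for a in tags for b in tags}
trans[("O", "I-PER")] = -10.0  # illegal BIO transition
ems = [
    {"B-PER": 2.0, "I-PER": 0.0, "O": 1.0},
    {"B-PER": 0.0, "I-PER": 2.0, "O": 1.0},
    {"B-PER": 0.0, "I-PER": 0.0, "O": 2.0},
]
print(viterbi_decode(ems, trans, tags))
```

The transition scores are what make the output "globally optimal": even if a single character's emission favors an invalid tag, the penalized transition steers the whole sequence toward a consistent labeling.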

Citation (APA)

Yu, P., & Wang, X. (2020). BERT-based named entity recognition in Chinese Twenty-Four Histories. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12432 LNCS, pp. 289–301). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60029-7_27
