A Mixed Semantic Features Model for Chinese NER with Characters and Words

Abstract

Named Entity Recognition (NER) is an essential part of many natural language processing (NLP) tasks. Existing Chinese NER methods are mostly based on word segmentation or use character sequences as input. However, a single-granularity representation suffers from out-of-vocabulary problems and word segmentation errors, and the semantic content it captures is relatively simple. In this paper, we introduce the self-attention mechanism into the BiLSTM-CRF neural network structure for Chinese named entity recognition with two embeddings. Unlike other models, our method combines character and word features at the sequence level, and the attention mechanism computes similarity over the full sequence consisting of both characters and words. The character semantic information and the structure of words work together to improve the accuracy of word boundary segmentation and to solve the problem of long-phrase combination. We validate our model on the MSRA and Weibo corpora, and experiments demonstrate that our model can significantly improve performance on the Chinese NER task.
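
The abstract does not give the details of how the mixed sequence is built or attended over, so the following is only a minimal sketch of the general idea: character and word vectors are placed in one sequence, and scaled dot-product self-attention computes similarity between every pair of positions. The concatenation order, the toy embedding values, the absence of learned query/key/value projections, and the names `self_attention`, `char_vectors`, and `word_vectors` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sequence):
    """Scaled dot-product self-attention where queries, keys, and values
    are all the input vectors themselves (no learned projections)."""
    dim = len(sequence[0])
    scale = math.sqrt(dim)
    weights = []
    for query in sequence:
        # Similarity of this position to every position in the sequence.
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in sequence]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all input vectors.
    outputs = [
        [sum(w * vec[d] for w, vec in zip(row, sequence)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical toy embeddings: three characters and two segmented words.
char_vectors = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]]
word_vectors = [[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 1.0, 1.0]]

# Assumed mixing strategy: characters followed by words in one sequence.
mixed_sequence = char_vectors + word_vectors
outputs, weights = self_attention(mixed_sequence)
```

In this sketch each character position can attend to word positions and vice versa, which is the property the abstract emphasizes. In a full model the attended vectors would presumably feed the BiLSTM-CRF tagger, but that wiring is not specified in the abstract.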

Cite

Chang, N., Zhong, J., Li, Q., & Zhu, J. (2020). A Mixed Semantic Features Model for Chinese NER with Characters and Words. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12035 LNCS, pp. 356–368). Springer. https://doi.org/10.1007/978-3-030-45439-5_24
