Abstract
Named entity recognition (NER) systems are often built with supervised methods that require large amounts of hand-annotated data. When hand-annotated data is limited, distantly supervised (DS) data and cross-domain (CD) data are usually used separately to improve performance. Distantly supervised data provides in-domain dictionary information, while cross-domain data provides hand-annotated information from other domains. These two types of information are complementary. However, two problems must be solved before the data can be used directly. First, distantly supervised data may contain a lot of noise. Second, directly using cross-domain data may degrade performance because of distribution mismatch. In this paper, we propose a unified model named PARE (PArtial learning and REinforcement learning). The PARE model can simultaneously use distantly supervised data and cross-domain data as external data. The model uses a partial learning method with a new label strategy to better handle the noise in distantly supervised data, and a reinforcement learning method to alleviate the distribution mismatch in cross-domain data. Experiments on three datasets show that our model outperforms the baseline models. In addition, our model can be used when no hand-annotated in-domain data is provided.
Cite
Hu, Y., He, H., Chen, Z., Zhu, Q., & Zheng, C. (2022). A Unified Model Using Distantly Supervised Data and Cross-Domain Data in NER. Computational Intelligence and Neuroscience, 2022. https://doi.org/10.1155/2022/1987829