A Unified Model Using Distantly Supervised Data and Cross-Domain Data in NER

Abstract

Named entity recognition (NER) systems are often built with supervised methods that require large amounts of hand-annotated data. When hand-annotated data is limited, distantly supervised (DS) data and cross-domain (CD) data are usually used separately to improve performance. Distantly supervised data provides in-domain dictionary information, while cross-domain data provides hand-annotated information from another domain. These two types of information are complementary. However, two problems must be solved before either can be used directly. First, distantly supervised data may contain a lot of noise. Second, directly using cross-domain data may degrade performance because of distribution mismatch. In this paper, we propose a unified model named PARE (PArtial learning and REinforcement learning). The PARE model can use distantly supervised data and cross-domain data simultaneously as external data. It uses partial learning with a new label strategy to better handle the noise in distantly supervised data, and reinforcement learning to alleviate the distribution mismatch in cross-domain data. Experiments on three datasets show that our model outperforms the baseline models. In addition, our model can be used when no hand-annotated in-domain data is provided.
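
To make the idea of distantly supervised data concrete, the following is a minimal sketch of dictionary-based labeling with partial labels. It is not the PARE label strategy, which the abstract does not specify. The function name `distant_labels`, the `UNKNOWN` marker, and the toy dictionary are all assumptions made for illustration. Tokens matched by the dictionary receive B-/I- tags; every other token is left unknown rather than forced to `O`, which is the general intuition behind partial learning on noisy dictionary matches.

```python
from typing import Dict, List, Tuple

UNKNOWN = "?"  # hypothetical marker: the token's true label is not known


def distant_labels(tokens: List[str],
                   dictionary: Dict[Tuple[str, ...], str]) -> List[str]:
    """Tag dictionary matches with B-/I- labels; leave other tokens UNKNOWN."""
    labels = [UNKNOWN] * len(tokens)
    max_len = max((len(key) for key in dictionary), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest dictionary entry that starts at position i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in dictionary:
                entity_type = dictionary[span]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Toy in-domain dictionary (illustrative only, not from the paper).
toy_dictionary = {("new", "york"): "LOC", ("acme",): "ORG"}
tokens = "Acme opened an office in New York".split()
print(distant_labels(tokens, toy_dictionary))
# ['B-ORG', '?', '?', '?', '?', 'B-LOC', 'I-LOC']
```

In a partial-learning setup, a sequence model would typically be trained so that the unknown positions can take any label, which avoids treating entities missing from the dictionary as confirmed non-entities.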

Citation (APA)

Hu, Y., He, H., Chen, Z., Zhu, Q., & Zheng, C. (2022). A Unified Model Using Distantly Supervised Data and Cross-Domain Data in NER. Computational Intelligence and Neuroscience, 2022. https://doi.org/10.1155/2022/1987829
