Applying Model Fusion to Augment Data for Entity Recognition in Legal Documents

Abstract

Named entity recognition in legal documents is a basic and crucial task that provides important knowledge for related tasks in the field of smart justice. However, automatically augmenting labeled named-entity data for legal documents remains difficult. To address this issue, we propose a novel data augmentation method for named entity recognition that fuses multiple models. First, we train ten models in total by running 5-fold cross-training on small-scale labeled datasets with the BiLSTM-CRF and BERT-BiLSTM-CRF models separately. Next, we apply two fusion modes: single-model fusion votes on the predictions of the five models that share a baseline, while multi-model fusion votes on the predictions of all ten models from the two baselines. We then treat the entities identified with high correctness across the multiple results as valid entities and add them to the training set for the next round of training. Finally, we run experiments on two public datasets and on a judicial dataset that we built. The results obtained with data augmentation are close to those obtained with five times as much labeled data, and clearly better than those obtained on the initial small-scale labeled datasets.
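The voting step can be illustrated with a minimal sketch. This is not the authors' implementation: the entity representation as (start, end, label) tuples, the function name vote_entities, and the min_votes threshold are all assumptions made for illustration, since the abstract does not state how votes are counted or what agreement level is required.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical representation: an entity is (start_token, end_token, label).
Entity = Tuple[int, int, str]


def vote_entities(predictions: List[Set[Entity]], min_votes: int) -> Set[Entity]:
    """Keep the entities predicted by at least `min_votes` of the models.

    `predictions` holds one set of entities per model for the same sentence.
    The threshold is an assumed parameter, not a value taken from the paper.
    """
    counts = Counter(entity for model_pred in predictions for entity in model_pred)
    return {entity for entity, votes in counts.items() if votes >= min_votes}


# Made-up predictions from five models of one baseline (single-model fusion).
single_model_preds = [
    {(0, 2, "PER"), (5, 7, "ORG")},
    {(0, 2, "PER")},
    {(0, 2, "PER"), (5, 7, "ORG")},
    {(0, 2, "PER"), (9, 10, "LOC")},
    {(0, 2, "PER"), (5, 7, "ORG")},
]

# Entities that pass the vote would be added to the training set
# for the next round of training.
accepted = vote_entities(single_model_preds, min_votes=4)
```

Under these assumptions, multi-model fusion would call the same function on the concatenated predictions of all ten models, with a correspondingly larger threshold.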

Cite

Zhang, H., Gao, H., Zhou, J., & Li, R. (2020). Applying Model Fusion to Augment Data for Entity Recognition in Legal Documents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12430 LNAI, pp. 244–255). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60450-9_20
