ECNUICA at SemEval-2021 Task 11: Rule based Information Extraction Pipeline

Abstract

This paper presents our approach to Task 11, NLPContributionGraph, of SemEval-2021. The task is to extract triples from papers in the Natural Language Processing field in order to construct an Open Research Knowledge Graph. It comprises three sub-tasks: detecting the contribution sentences in a paper; identifying scientific terms and predicate phrases in those sentences; and inferring triples of the form (subject, predicate, object) as statements for knowledge graph building. For sub-tasks one and two, we apply an ensemble of fine-tuned pretrained language models (PLMs) and adopt self-training to address the shortage of annotated data. For the third sub-task, rather than using classic neural open information extraction (OIE) architectures, we generate candidate triples with manually designed rules and train a binary classifier to separate the positive ones from the rest. The quantitative results show that we rank 4th, 2nd, and 2nd in the three evaluation phases.
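The generate-then-filter idea for the third sub-task can be illustrated with a minimal sketch. The abstract does not describe the actual rules or the classifier, so everything below is an assumption made for illustration: the `Phrase` type, the single ordering rule (a predicate links a term before it to a term after it), and the distance-based `toy_classifier`, which merely stands in for a trained binary classifier.

```python
from typing import Callable, List, NamedTuple, Tuple


class Phrase(NamedTuple):
    """A phrase found in a contribution sentence (hypothetical structure)."""
    text: str
    kind: str   # "term" or "predicate"
    start: int  # token offset of the phrase within the sentence


Triple = Tuple[Phrase, Phrase, Phrase]  # (subject, predicate, object)


def generate_candidates(phrases: List[Phrase]) -> List[Triple]:
    """Apply one illustrative rule: subject < predicate < object by position."""
    terms = [p for p in phrases if p.kind == "term"]
    predicates = [p for p in phrases if p.kind == "predicate"]
    candidates: List[Triple] = []
    for pred in predicates:
        for subj in terms:
            for obj in terms:
                if subj.start < pred.start < obj.start:
                    candidates.append((subj, pred, obj))
    return candidates


def filter_triples(candidates: List[Triple],
                   classifier: Callable[[Triple], bool]) -> List[Triple]:
    """Keep only the candidates that the binary classifier accepts."""
    return [t for t in candidates if classifier(t)]


def toy_classifier(triple: Triple) -> bool:
    """Stand-in for a trained model: accept triples whose subject and
    object start at most four tokens apart (an arbitrary assumption)."""
    subj, _, obj = triple
    return obj.start - subj.start <= 4


# Usage on a made-up sentence: "our model achieves strong results ... on SQuAD"
phrases = [
    Phrase("our model", "term", 0),
    Phrase("achieves", "predicate", 2),
    Phrase("strong results", "term", 3),
    Phrase("on", "predicate", 6),
    Phrase("SQuAD", "term", 7),
]
candidates = generate_candidates(phrases)
kept = filter_triples(candidates, toy_classifier)
kept_text = [(s.text, p.text, o.text) for s, p, o in kept]
```

The rule step deliberately over-generates, and the classifier step removes implausible combinations; in a real system the classifier would score each candidate using the sentence context rather than token distance.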

Cite

Lin, J., Ling, J., Wang, Z., Liu, J., Chen, Q., & He, L. (2021). ECNUICA at SemEval-2021 Task 11: Rule based Information Extraction Pipeline. In SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop (pp. 1295–1302). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.semeval-1.185
