Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System

Patrik Zajec; Dunja Mladenić

Journal Article, Open Access

Informatica (Slovenia) (2022) 46(1) 121-128

DOI: 10.31449/inf.v46i1.3577

Abstract

The paper presents a methodology for training an event argument extraction system in a semi-supervised setting. We use Wikipedia and Wikidata to automatically obtain a small, noisily labeled dataset and a large unlabeled dataset. The dataset consists of event clusters containing Wikipedia pages in multiple languages. The unlabeled data is labeled iteratively using semi-supervised learning combined with probabilistic soft logic, which infers the pseudo-label of each example from the predictions of multiple base learners. The proposed methodology is applied to Wikipedia pages about earthquakes and terrorist attacks in a cross-lingual setting. Our experiments show that the proposed methodology improves the results: the system achieves an F1-score of 0.79 when only the automatically labeled dataset is used, and an F1-score of 0.84 when it is trained with semi-supervised learning combined with probabilistic soft logic.
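The iterative pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base learners here are toy count-based classifiers, the feature names (`entity_type`, `left_word`, `unit`) and the example data are hypothetical, and a simple averaged vote with a confidence threshold stands in for probabilistic soft logic inference.

```python
from collections import Counter

def train_count_learner(labeled, feature):
    """Toy base learner: estimate P(label | feature value) by counting."""
    counts = {}
    for example, label in labeled:
        counts.setdefault(example[feature], Counter())[label] += 1

    def predict(example):
        seen = counts.get(example[feature])
        if not seen:
            return {}  # abstain on feature values never seen in training
        total = sum(seen.values())
        return {label: n / total for label, n in seen.items()}

    return predict

def combine(predictions):
    """Average the learners' distributions (assumed stand-in for PSL inference)."""
    scores = Counter()
    for dist in predictions:
        for label, p in dist.items():
            scores[label] += p / len(predictions)
    if not scores:
        return None, 0.0
    return scores.most_common(1)[0]

def self_train(labeled, unlabeled, features, threshold=0.6, max_rounds=5):
    """Iteratively move confidently pseudo-labeled examples into the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        # Retrain one base learner per feature on the current labeled set.
        learners = [train_count_learner(labeled, f) for f in features]
        remaining, added = [], 0
        for example in unlabeled:
            label, score = combine([predict(example) for predict in learners])
            if label is not None and score >= threshold:
                labeled.append((example, label))
                added += 1
            else:
                remaining.append(example)
        unlabeled = remaining
        if added == 0:
            break  # no new pseudo-labels, so further rounds change nothing
    return labeled, unlabeled

# Hypothetical seed set (small, automatically labeled) and unlabeled pool.
seed = [
    ({"entity_type": "NUMBER", "left_word": "magnitude", "unit": "Mw"}, "magnitude"),
    ({"entity_type": "PLACE", "left_word": "near", "unit": ""}, "location"),
]
pool = [
    {"entity_type": "NUMBER", "left_word": "magnitude", "unit": "ML"},
    {"entity_type": "DECIMAL", "left_word": "magnitude", "unit": "ML"},
    {"entity_type": "DATE", "left_word": "on", "unit": ""},
]
features = ["entity_type", "left_word", "unit"]
labeled, remaining = self_train(seed, pool, features)
```

In this toy run the second pool example only becomes confident in the second round, after the first pseudo-label has taught the `unit` learner a new value. That is the effect the iterative scheme relies on. The third example never reaches the threshold and stays unlabeled.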

Cite (APA)

Zajec, P., & Mladenić, D. (2022). Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System. Informatica (Slovenia), 46(1), 121–128. https://doi.org/10.31449/inf.v46i1.3577
