Concept Extraction Using Pointer–Generator Networks and Distant Supervision for Data Augmentation

Abstract

Concept extraction is crucial for a number of downstream applications. Surprisingly, however, straightforward techniques still prevail: alignment of single tokens or nominal chunks to concepts, or dictionary lookup as in DBpedia Spotlight. We propose a generic, open-domain extractive model based on distant supervision of a pointer–generator network that uses bidirectional LSTMs and a copy mechanism and can cope with out-of-vocabulary terms. The model was trained on a large annotated corpus compiled specifically for this task from 250K Wikipedia pages and tested on regular pages, where links to other pages are taken as ground-truth concepts. The experiments show that our model significantly outperforms standard techniques and, when used on top of DBpedia Spotlight, further improves its performance. They also show that the model can be readily ported to other datasets, on which it likewise achieves state-of-the-art performance.
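
The abstract does not spell out how the copy mechanism handles out-of-vocabulary terms, so the following is only a minimal sketch of the commonly used pointer–generator mixing step, not the authors' implementation. It assumes a generation probability `p_gen`, a distribution over a fixed vocabulary, and attention weights over the source tokens; the function name `final_distribution` and all numbers are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def final_distribution(
    p_gen: float,
    vocab_dist: Dict[str, float],
    attention: List[float],
    source_tokens: List[str],
) -> Dict[str, float]:
    """Mix a generation distribution with a copy distribution.

    Assumed formulation (not confirmed by the abstract):
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on positions of w.
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    # Generation part: scale the fixed-vocabulary distribution by p_gen.
    mixed = {word: p_gen * prob for word, prob in vocab_dist.items()}
    # Copy part: add (1 - p_gen) * attention to each source token, summing
    # over repeated occurrences. Tokens absent from the vocabulary get an
    # entry here, which is how out-of-vocabulary terms can still be emitted.
    for token, weight in zip(source_tokens, attention):
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed


# Toy numbers, invented for illustration only.
vocab_dist = {"the": 0.5, "network": 0.3, "<unk>": 0.2}
source = ["the", "pointer-generator", "network"]
attention = [0.1, 0.7, 0.2]

dist = final_distribution(0.4, vocab_dist, attention, source)
best = max(dist, key=dist.get)
```

In this toy setup the source token "pointer-generator" is not in the vocabulary, yet it receives the largest share of probability mass because the attention weights point at it. That is the property that lets a model of this kind extract concepts it never saw in its vocabulary.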

Cite

Shvets, A., & Wanner, L. (2020). Concept Extraction Using Pointer–Generator Networks and Distant Supervision for Data Augmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12387 LNAI, pp. 120–135). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61244-3_8
