Active learning with adaptive density weighted sampling for information extraction from scientific papers

Abstract

The paper addresses the task of information extraction from scientific literature with machine learning methods. In particular, it considers the extraction of definitions and results from scientific publications in Russian. We note that annotating scientific texts to create a training dataset is a very labor-intensive and expensive process. To tackle this problem, we propose methods and tools based on active learning. We describe and evaluate a novel adaptive density-weighted sampling (ADWeS) meta-strategy for active learning. The experiments demonstrate that active learning can be a very efficient technique for scientific text mining, and that the proposed meta-strategy can be beneficial when annotating corpora with a strongly skewed class distribution. We also investigate informative task-independent features for information extraction from scientific texts and present an openly available tool for corpus annotation, which is equipped with ADWeS and is compatible with well-known sampling strategies.
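
For readers unfamiliar with density weighting in active learning, the sketch below may help. It is not ADWeS: the abstract does not specify the adaptive part of the meta-strategy, so the code shows only the classic, non-adaptive information-density heuristic, in which a base uncertainty score is multiplied by each candidate's average similarity to the rest of the unlabeled pool. The function names, the `beta` exponent, and the choice of entropy and cosine similarity are all assumptions made for illustration.

```python
import numpy as np


def entropy(probs):
    """Base uncertainty: Shannon entropy of each row of class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def density(features):
    """Mean cosine similarity of each item to every item in the pool.

    Self-similarity is included in the mean; negative means are clipped
    to zero so the value can safely be raised to a fractional power.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T
    return np.clip(similarity.mean(axis=1), 0.0, None)


def density_weighted_query(probs, features, beta=1.0, batch_size=1):
    """Return indices of pool items with the highest density-weighted score.

    probs:    (n_items, n_classes) predicted class probabilities
    features: (n_items, n_features) feature vectors of the same items
    beta:     hypothetical knob; 0 ignores density, larger values favor
              items that lie in dense regions of the pool
    """
    scores = entropy(probs) * density(features) ** beta
    return np.argsort(-scores)[:batch_size]
```

With `beta=0` this reduces to plain uncertainty sampling, which is one reason a density term is often described as a wrapper around an existing strategy rather than a strategy of its own.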

Cite

Suvorov, R., Shelmanov, A., & Smirnov, I. (2018). Active learning with adaptive density weighted sampling for information extraction from scientific papers. In Communications in Computer and Information Science (Vol. 789, pp. 77–90). Springer Verlag. https://doi.org/10.1007/978-3-319-71746-3_7
