Active learning with adaptive density weighted sampling for information extraction from scientific papers

Roman Suvorov; Artem Shelmanov; Ivan Smirnov

Conference Proceedings

Communications in Computer and Information Science (2018) 789 77-90

DOI: 10.1007/978-3-319-71746-3_7

Abstract

The paper addresses the task of information extraction from scientific literature with machine learning methods. In particular, it considers the extraction of definitions and results from scientific publications in Russian. We note that annotating scientific texts to create a training dataset is a very labor-intensive and expensive process. To tackle this problem, we propose methods and tools based on active learning. We describe and evaluate a novel adaptive density-weighted sampling (ADWeS) meta-strategy for active learning. The experiments demonstrate that active learning can be a very efficient technique for scientific text mining, and that the proposed meta-strategy can be beneficial when annotating corpora with a strongly skewed class distribution. We also investigate informative task-independent features for information extraction from scientific texts and present an openly available tool for corpus annotation, which is equipped with ADWeS and is compatible with well-known sampling strategies.
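The abstract does not spell out how ADWeS works, so the following is not the authors' method. It is only a minimal sketch of the generic density-weighted idea that the name alludes to: an unlabeled example is scored by the model's uncertainty about it, multiplied by how similar it is on average to the rest of the pool. The function names, the least-confidence uncertainty measure, the cosine similarity, and the `beta` exponent are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def density_weighted_scores(proba, features, beta=1.0):
    """Score each unlabeled item as uncertainty * density ** beta.

    proba    -- per-item class probabilities from the current model (assumed input)
    features -- per-item feature vectors used to measure similarity
    beta     -- hypothetical knob: 0 ignores density, larger values favor dense regions
    """
    scores = []
    for p, x in zip(proba, features):
        # Least-confidence uncertainty: 1 minus the top class probability.
        uncertainty = 1.0 - max(p)
        # Density: mean cosine similarity to every item in the pool.
        density = sum(cosine(x, other) for other in features) / len(features)
        density = max(density, 0.0)  # keep the base non-negative for the power
        scores.append(uncertainty * density ** beta)
    return scores


def select_batch(proba, features, batch_size=1, beta=1.0):
    """Return indices of the highest-scoring items to send to the annotator."""
    scores = density_weighted_scores(proba, features, beta)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]
```

With `beta=0` the score reduces to plain uncertainty sampling. An adaptive variant would presumably adjust the density weight during annotation, but how ADWeS does so is not stated here.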

Citation (APA)

Suvorov, R., Shelmanov, A., & Smirnov, I. (2018). Active learning with adaptive density weighted sampling for information extraction from scientific papers. In Communications in Computer and Information Science (Vol. 789, pp. 77–90). Springer Verlag. https://doi.org/10.1007/978-3-319-71746-3_7
