Identifying references to datasets in publications

Katarina Boland; Dominique Ritze; Kai Eckert; Brigitte Mathiak

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7489 LNCS 150-161

DOI: 10.1007/978-3-642-33290-6_17

29 Citations

25 Readers

Abstract

Research data and publications are usually stored in separate and structurally distinct information systems. Often, links between these resources are not explicitly available, which complicates the search for previous research. In this paper, we propose a pattern induction method for the detection of study references in full texts. Since these references are not specified in a standardized way and may occur inside a variety of different contexts (i.e., captions, footnotes, or continuous text), our algorithm is required to induce very flexible patterns. To overcome the sparse distribution of training instances, we induce patterns iteratively using a bootstrapping approach. We show that our method achieves promising results for the automatic identification of data references and is a first step towards building an integrated information system. © 2012 Springer-Verlag.
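
The iterative pattern induction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the one-word context window, the `NAME` shape (a capitalized token followed by a year), and the function names `induce_patterns`, `apply_patterns`, and `bootstrap` are all assumptions made for illustration. Starting from one seed study name, the loop induces patterns from the contexts of known names, applies them to find new names, and repeats until nothing new turns up.

```python
import re

# Assumption: a study name looks like a capitalized token plus a year.
NAME = r"[A-Z][\w-]+ \d{4}"

# Made-up example sentences, not taken from the paper.
corpus = [
    "We use data from the ALLBUS 2008 survey to estimate trust.",
    "Results are based on the SOEP 2010 survey of households.",
    "Panel weights come from the SOEP 2010 dataset.",
    "The analysis draws on the ISSP 2006 dataset.",
]

def induce_patterns(texts, names):
    """Turn the word before and after each known name into a regex."""
    patterns = set()
    for text in texts:
        for name in names:
            for m in re.finditer(re.escape(name), text):
                left = re.findall(r"\w+", text[:m.start()])[-1:]
                right = re.findall(r"\w+", text[m.end():])[:1]
                if left and right:
                    patterns.add(
                        rf"\b{re.escape(left[0])}\W+({NAME})"
                        rf"\W+{re.escape(right[0])}\b"
                    )
    return patterns

def apply_patterns(texts, patterns):
    """Collect every name-shaped string captured by any pattern."""
    return {m.group(1) for p in patterns for t in texts
            for m in re.finditer(p, t)}

def bootstrap(texts, seeds, max_iterations=5):
    """Alternate induction and extraction until no new names appear."""
    names = set(seeds)
    for _ in range(max_iterations):
        found = apply_patterns(texts, induce_patterns(texts, names))
        if found <= names:  # nothing new: stop early
            break
        names |= found
    return names

result = bootstrap(corpus, {"ALLBUS 2008"})
```

In this toy run the seed only yields a "the ... survey" pattern, which finds "SOEP 2010"; that name in turn occurs in a "the ... dataset" context, and the pattern induced from it finds "ISSP 2006" in the second round. A real system would need to score and filter patterns, since such short contexts would over-generate on a full-text corpus.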

Citation (APA)

Boland, K., Ritze, D., Eckert, K., & Mathiak, B. (2012). Identifying references to datasets in publications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7489 LNCS, pp. 150–161). https://doi.org/10.1007/978-3-642-33290-6_17
