The role of software in science: a knowledge graph-based analysis of software mentions in PubMed Central

David Schindler; Felix Bensmann; Stefan Dietze; Frank Krüger

Journal Article (Open Access)

PeerJ Computer Science (2022) 8

DOI: 10.7717/PEERJ-CS.835

31 Citations

25 Readers

Abstract

Science across all disciplines has become increasingly data-driven, leading to additional needs with respect to software for collecting, processing and analysing data. Thus, transparency about software used as part of the scientific process is crucial to understand provenance of individual research data and insights, is a prerequisite for reproducibility and can enable macro-analysis of the evolution of scientific methods over time. However, missing rigor in software citation practices renders the automated detection and disambiguation of software mentions a challenging problem. In this work, we provide a large-scale analysis of software usage and citation practices facilitated through an unprecedented knowledge graph of software mentions and affiliated metadata generated through supervised information extraction models trained on a unique gold standard corpus and applied to more than 3 million scientific articles. Our information extraction approach distinguishes different types of software and mentions, disambiguates mentions and outperforms the state-of-the-art significantly, leading to the most comprehensive corpus of 11.8 M software mentions that are described through a knowledge graph consisting of more than 300 M triples. Our analysis provides insights into the evolution of software usage and citation patterns across various fields, ranks of journals, and impact of publications. Whereas, to the best of our knowledge, this is the most comprehensive analysis of software use and citation at the time, all data and models are shared publicly to facilitate further research into scientific use and citation of software.

Citation (APA)

Schindler, D., Bensmann, F., Dietze, S., & Krüger, F. (2022). The role of software in science: a knowledge graph-based analysis of software mentions in PubMed Central. PeerJ Computer Science, 8. https://doi.org/10.7717/PEERJ-CS.835
