LOPDF: A framework for extracting and producing open data of scientific documents for smart digital libraries

Abstract

Background. The results of scientific experiments and research, whether conducted by individuals or organizations, are published and shared with the scientific community in different types of scientific publications, such as books, chapters, journals, articles, reference works and reference work entries. One aspect of these documents is their content; the other is their metadata. Metadata of scientific documents can be used to increase cooperation, to find people with common interests and research work, and to find scientific documents in matching domains. The major obstacle to obtaining these benefits is that the metadata of scientific publications is available only in unstructured (or semi-structured) formats, so it cannot be used to ask smart queries that support computation and different types of analysis on scientific publications data. In addition, acquiring and smartly processing publications data is a complicated, time-consuming and resource-intensive task.

Methods. To address this problem, we have developed a generic framework named the Linked Open Publications Data Framework (LOPDF). LOPDF can be used to crawl, process, extract and produce machine-understandable data (i.e., Linked Open Data, LOD) about scientific publications from different publisher-specific sources such as portals, XML exports and websites. In this paper we present the architecture, process and algorithm that we developed to process textual publications data and to produce semantically enriched data as RDF datasets (i.e., open data).

Results. The resulting datasets can be queried with smart queries using SPARQL. We also present a quantitative and qualitative analysis of the resulting datasets, which can ultimately be used to compute the research behavior of organizations in a rapidly growing knowledge society. Finally, we present the potential uses of producing and processing such open data of scientific publications, and show how the results of smart queries on the resulting open datasets can be used to compute impact and perform different types of analysis on scientific publications data.
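To make the idea of turning publication metadata into queryable RDF more concrete, the following is a minimal sketch in plain Python. It is not the LOPDF algorithm, which the abstract does not detail. The `BASE` namespace, the record layout, and the helper names `record_to_triples`, `to_ntriples` and `match` are all assumptions made for illustration; the predicates come from the Dublin Core Terms vocabulary, and the sample record reuses this article's own bibliographic details.

```python
# Illustrative sketch only: NOT the LOPDF implementation.
# BASE, the record layout and all function names are hypothetical.

DCTERMS = "http://purl.org/dc/terms/"
XSD_GYEAR = "http://www.w3.org/2001/XMLSchema#gYear"
BASE = "http://example.org/publication/"  # assumed namespace for subjects


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def record_to_triples(record):
    """Map one metadata record (a dict) to (subject, predicate, object) terms."""
    subject = "<" + BASE + record["id"] + ">"
    title = escape_literal(record["title"])
    triples = [(subject, "<" + DCTERMS + "title>", '"' + title + '"')]
    for author in record.get("authors", []):
        name = escape_literal(author)
        triples.append((subject, "<" + DCTERMS + "creator>", '"' + name + '"'))
    if "year" in record:
        year = '"' + str(record["year"]) + '"^^<' + XSD_GYEAR + ">"
        triples.append((subject, "<" + DCTERMS + "issued>", year))
    if "doi" in record:
        doi_iri = "<https://doi.org/" + record["doi"] + ">"
        triples.append((subject, "<" + DCTERMS + "identifier>", doi_iri))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(s + " " + p + " " + o + " ." for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


example = {
    "id": "peerj-cs-445",  # hypothetical local identifier
    "title": "LOPDF: A framework for extracting and producing open data "
             "of scientific documents for smart digital libraries",
    "authors": ["Aslam, M. A."],
    "year": 2021,
    "doi": "10.7717/PEERJ-CS.445",
}

example_triples = record_to_triples(example)
print(to_ntriples(example_triples))
```

The `match` helper stands in for a single SPARQL triple pattern. On a real triple store, a comparable lookup might be written as `SELECT ?pub WHERE { ?pub dcterms:creator "Aslam, M. A." }`, assuming the `dcterms` prefix is bound to the Dublin Core Terms namespace.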

Cite

Aslam, M. A. (2021). LOPDF: A framework for extracting and producing open data of scientific documents for smart digital libraries. PeerJ Computer Science, 7, 1–23. https://doi.org/10.7717/PEERJ-CS.445
