An automatic workflow for the formalization of scholarly articles’ structural and semantic elements

Bahar Sateli; René Witte

Conference Proceedings

Communications in Computer and Information Science (2016) 641 309-320

DOI: 10.1007/978-3-319-46565-4_24

Abstract

We present a workflow for the automatic transformation of scholarly literature into a Linked Open Data (LOD) compliant knowledge base to address Task 2 of the Semantic Publishing Challenge 2016. In this year’s task, we aim to extract various kinds of contextual information from full-text papers using a text mining pipeline that integrates LOD-based Named Entity Recognition (NER) and triplification of the detected entities. In our proposed approach, we leverage an existing NER tool to ground named entities, such as geographical locations, to their LOD resources. Combined with a rule-based approach, we demonstrate how we can extract both the structural (e.g., floats and sections) and semantic elements (e.g., authors and their respective affiliations) of the provided dataset’s documents. Finally, we integrate LODeXporter, our flexible exporting module, to represent the results as semantic triples in RDF format. As a result, we generate a scalable, TDB-based knowledge base that is interlinked with the LOD cloud, and a public SPARQL endpoint for the task’s queries. Our submission won second place in Task 2 of the SemPub2016 challenge, with an average F-score of 0.63.

Cite

Sateli, B., & Witte, R. (2016). An automatic workflow for the formalization of scholarly articles’ structural and semantic elements. In Communications in Computer and Information Science (Vol. 641, pp. 309–320). Springer Verlag. https://doi.org/10.1007/978-3-319-46565-4_24
