A Hybrid Approach and Unified Framework for Bibliographic Reference Extraction

Abstract

Publications are an integral part of a scientific community. Extracting bibliographic references from scientific publications is a challenging task because referencing styles and document layouts vary widely. Existing methods perform adequately on one dataset, but applying them to a different dataset proves challenging. A generic solution is therefore needed that overcomes the limitations of previous approaches. The contribution of this paper is three-fold. First, it presents a novel approach called DeepBiRD, which is inspired by human visual perception and exploits layout features to identify individual references in a scientific publication. Second, we release a large dataset for image-based reference detection with 2401 scans containing 38863 references, each manually annotated as an individual reference. Third, we present a unified and highly configurable end-to-end automatic bibliographic reference extraction framework called BRExSys, which employs DeepBiRD along with state-of-the-art text-based models to detect and visualize references in a bibliographic document. The proposed approach first pre-processes each image, obtaining a hybrid representation by applying different computer vision techniques to it. It then performs layout-driven reference detection on the given scientific publication using Mask R-CNN. DeepBiRD was evaluated on two different datasets to demonstrate how well the approach generalizes. The proposed system achieved an AP50 of 98.56% on our dataset, and DeepBiRD significantly outperformed the current state-of-the-art approach on that approach's own dataset. These results suggest that DeepBiRD delivers significantly better performance, generalizes well, and is independent of any domain or referencing style.
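To make the AP50 figure concrete, the following is a minimal sketch of how average precision at an IoU threshold of 0.5 can be computed for detected reference regions. It is not the authors' evaluation code. It assumes axis-aligned bounding boxes in `(x1, y1, x2, y2)` form and all-point interpolation of the precision-recall curve, whereas Mask R-CNN evaluations are often run on masks with COCO-style tooling. The function names `iou` and `average_precision_50` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(detections, ground_truth, threshold=0.5):
    """AP at one IoU threshold (assumed setup, single class).

    detections: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each annotated reference can be matched once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)
    # Precision and recall after each ranked detection.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Make precision non-increasing from left to right (monotone envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap
```

Under these assumptions, a detection counts as correct only when it overlaps a not-yet-matched annotated reference by at least half of their combined area, so a high AP50 means that nearly every reference on a page was localized with few spurious regions.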

Rizvi, S. T. R., Dengel, A., & Ahmed, S. (2020). A Hybrid Approach and Unified Framework for Bibliographic Reference Extraction. IEEE Access, 8, 217231–217245. https://doi.org/10.1109/ACCESS.2020.3042455
