A new approach towards bibliographic reference identification, parsing and inline citation matching

Deepank Gupta; Bob Morris; Terry Catapano; Guido Sautter

Conference Proceedings

Communications in Computer and Information Science (2009) 40 93-102

DOI: 10.1007/978-3-642-03547-0_10

9 citations

34 readers

Abstract

Several algorithms and approaches have been proposed for scanning and digitizing research papers. Past work falls into three major approaches: regular-expression-based heuristics, learning-based algorithms, and knowledge-based systems. Our findings point to the inadequacy of existing open-source solutions such as Paracite for papers with "micro-citations" in various European languages. This paper describes work done as part of the Google Summer of Code 2008, which combines regular-expression-based heuristics and knowledge-based systems into a system that matches inline citations to their corresponding bibliographic references and identifies and extracts metadata from references. We present the description, implementation, and results of our approach. Our approach improves accuracy and yields better recognition rates. © 2009 Springer Berlin Heidelberg.
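To make the matching task in the abstract concrete, the following is a minimal sketch of a regular-expression heuristic that links author-year inline citations to reference entries. It is not the paper's implementation: the patterns `INLINE` and `REFERENCE`, the functions `reference_key` and `match_citations`, and the sample strings are all hypothetical, and the knowledge-based component mentioned in the abstract is not modeled here.

```python
import re

# Hypothetical pattern for author-year inline citations such as
# "(Smith 2004)", "Smith & Jones (1999)" or "Jones et al. (2006)".
INLINE = re.compile(
    r"\b([A-Z][\w'-]+)"                               # first author surname
    r"(?:\s+(?:&|and)\s+[A-Z][\w'-]+|\s+et\s+al\.)?"  # optional co-author or "et al."
    r",?\s+\(?(\d{4}[a-z]?)\b"                        # year, optionally in parentheses
)

# Hypothetical pattern for a reference entry that starts "Surname, ..."
# and contains a four-digit year somewhere after the surname.
REFERENCE = re.compile(r"^([A-Z][\w'-]+),.*?\b(\d{4}[a-z]?)\b")


def reference_key(reference):
    """Return (surname, year) for a reference entry, or None if it does not parse."""
    m = REFERENCE.match(reference)
    if m is None:
        return None
    return (m.group(1).lower(), m.group(2))


def match_citations(text, references):
    """Pair each inline citation found in text with a reference entry, or None."""
    index = {}
    for ref in references:
        key = reference_key(ref)
        if key is not None:
            # Keep the first entry seen for a given (surname, year) key.
            index.setdefault(key, ref)
    matches = []
    for m in INLINE.finditer(text):
        key = (m.group(1).lower(), m.group(2))
        matches.append((key, index.get(key)))
    return matches


# Made-up sample data, for illustration only.
references = [
    "Smith, J. (2004). Parsing references. Journal of Examples 3: 1-10.",
    "Jones, A., Brown, B. & Clark, C. (2006). Matching citations. Example Press.",
]
text = "Earlier work (Smith 2004) was extended by Jones et al. (2006). See also Miller 1999."

for key, ref in match_citations(text, references):
    print(key, "->", ref)
```

A purely pattern-based matcher like this one will also flag phrases such as "In 2004" as citations, which is one reason heuristics of this kind are typically paired with additional knowledge about author names and reference layouts.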

Cite

Gupta, D., Morris, B., Catapano, T., & Sautter, G. (2009). A new approach towards bibliographic reference identification, parsing and inline citation matching. In Communications in Computer and Information Science (Vol. 40, pp. 93–102). https://doi.org/10.1007/978-3-642-03547-0_10
