Scientific literature is increasingly available on the World Wide Web. This paper considers how to match citations found in different papers so that a citation index can be constructed autonomously from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to determine automatically whether two citations refer to the same paper, since citations can be written in many different formats. We present four algorithms for autonomous citation matching, based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. Of the four, the word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.
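To make the matching problem concrete, the following is a minimal Python sketch of two of the ideas named above: an edit distance between citation strings and a comparison of their word sets. It is not the authors' implementation. The function names (`normalize`, `edit_distance`, `word_overlap`, `same_paper`), the use of Jaccard overlap as the word-level score, and the 0.5 threshold are all assumptions chosen for illustration; the abstract does not specify how the paper's algorithms score or threshold a match.

```python
import re


def normalize(citation: str) -> str:
    """Lowercase the citation and turn every run of punctuation into one space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", citation.lower()).split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the two citations' word sets (an illustrative score)."""
    words_a = set(normalize(a).split())
    words_b = set(normalize(b).split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def same_paper(a: str, b: str, threshold: float = 0.5) -> bool:
    """Guess that two citations match when the word overlap reaches the
    threshold. The default of 0.5 is an arbitrary, hypothetical value."""
    return word_overlap(a, b) >= threshold


if __name__ == "__main__":
    first = ("Lawrence, S., Giles, C. L., & Bollacker, K. D. (1999). "
             "Autonomous citation matching.")
    second = ("S. Lawrence, C. L. Giles, K. D. Bollacker. "
              "Autonomous citation matching, 1999.")
    print(edit_distance(normalize(first), normalize(second)))
    print(word_overlap(first, second))
    print(same_paper(first, second))
```

The two example strings cite the same paper in different formats. Reordering the fields changes the character sequence, so the edit distance between them is nonzero, while the word sets are unaffected by order. This is one way to see why a character-level measure and a word-level measure can disagree on the same pair of citations.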
CITATION STYLE
Lawrence, S., Giles, C. L., & Bollacker, K. D. (1999). Autonomous citation matching. In Proceedings of the International Conference on Autonomous Agents (pp. 392–393). ACM. https://doi.org/10.1145/301136.301255