Statistical analysis of bibliographic strings for constructing an integrated document space


Abstract

It is important to utilize retrospective documents when constructing a large digital library. This paper proposes a method for analyzing recognized bibliographic strings using an extended hidden Markov model. The proposed method enables analysis of erroneous bibliographic strings and integrates many documents accumulated as printed articles in a citation index. The proposed method has the advantage of providing a robust bibliographic matching function using the statistical description of the syntax of bibliographic strings, a language model and an Optical Character Recognition (OCR) error model. The method also has the advantage of reducing the cost of preparing training data for parameter estimation, using records in the bibliographic database.
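As a rough illustration of the idea of parsing a recognized bibliographic string with a hidden Markov model, the sketch below labels each token of a string with a field (author, title, year) using plain Viterbi decoding. It is not the extended model proposed in the paper: the states, the token features, every probability value, and the character-substitution table standing in for an OCR error model are hypothetical and hand-set. In the setting the abstract describes, such parameters would instead be estimated, with records in the bibliographic database serving as training data.

```python
import math

# Hypothetical field states and hand-set probabilities (illustration only).
STATES = ["author", "title", "year"]
START = {"author": 0.8, "title": 0.15, "year": 0.05}
TRANS = {
    "author": {"author": 0.6, "title": 0.35, "year": 0.05},
    "title": {"author": 0.05, "title": 0.7, "year": 0.25},
    "year": {"author": 0.05, "title": 0.05, "year": 0.9},
}
# Emission probabilities over coarse token features rather than raw characters.
EMIT = {
    "author": {"surname": 0.45, "initial": 0.45, "capword": 0.05, "word": 0.03, "yearlike": 0.02},
    "title": {"surname": 0.03, "initial": 0.02, "capword": 0.25, "word": 0.68, "yearlike": 0.02},
    "year": {"surname": 0.015, "initial": 0.015, "capword": 0.015, "word": 0.015, "yearlike": 0.94},
}
# Crude stand-in for an OCR error model: letters often misread for digits.
OCR_FIX = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})


def feature(token):
    """Map a token to one of the coarse features used by EMIT."""
    core = token.strip(".,()")
    if len(core) == 4 and core.translate(OCR_FIX).isdigit():
        return "yearlike"
    if token.endswith(".") and len(core) <= 2 and core[:1].isupper():
        return "initial"
    if token.endswith(",") and core[:1].isupper():
        return "surname"
    if core[:1].isupper():
        return "capword"
    return "word"


def viterbi(tokens):
    """Return the most probable field label for each token."""
    if not tokens:
        return []
    obs = [feature(t) for t in tokens]
    # Log-probabilities of the best path ending in each state, per position.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}]
    back = []
    for o in obs[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            row[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


tokens = "Takasu, A. Statistical analysis of bibliographic strings 2OO2".split()
for token, label in zip(tokens, viterbi(tokens)):
    print(f"{label:7s} {token}")
```

The last token in the usage example is written with the letter O in place of zero to mimic a recognition error. With these toy parameters the substitution table still lets it be scored as year-like, which is the kind of tolerance a statistical error model is meant to provide.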

Cite

Takasu, A. (2002). Statistical analysis of bibliographic strings for constructing an integrated document space. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2458, pp. 75–90). Springer Verlag. https://doi.org/10.1007/3-540-45747-x_6
