Information extraction in handwritten marriage licenses books using the MGGI methodology

Abstract

Historical records of daily activities provide intriguing insights into the lives of our ancestors and are useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. The records in these books follow a simple textual structure but have an evolving vocabulary, composed mainly of proper names that change over time. This distinctive vocabulary makes automatic transcription and semantic information extraction difficult. In previous work we studied the use of category-based language models and how a grammatical inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in those earlier results and apply a better implementation of the MGGI technique to address them. Using the resulting language model, we carried out transcription and information extraction experiments, and the results support the proposed approach.

Cite

Romero, V., Fornés, A., Vidal, E., & Sánchez, J. A. (2017). Information extraction in handwritten marriage licenses books using the MGGI methodology. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10255 LNCS, pp. 287–294). Springer Verlag. https://doi.org/10.1007/978-3-319-58838-4_32
