A hybrid system for German encyclopedia alignment

Abstract

Collaboratively created online encyclopedias have become increasingly popular. Especially in terms of completeness, they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and started an initiative to merge their corpora to create a single, more complete encyclopedia. The crucial step in this merging process is the alignment of articles. We have developed a two-step hybrid system to provide highly accurate alignments with low manual effort. First, we apply an automatic alignment algorithm based on information retrieval. Second, the articles with a low confidence score are revised using a manual alignment scheme carefully designed for quality assurance. Our evaluation shows that a combination of weighting and ranking techniques utilizing different facets of the encyclopedia articles effectively reduces the number of necessary manual alignments. Further, the setup of the manual alignment turned out to be robust against inter-indexer inconsistencies. As a result, the developed system enabled us to align four encyclopedias with high accuracy and low effort. © 2011 Springer-Verlag.
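
The two-step idea described in the abstract (automatic alignment first, manual review for low-confidence cases) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the weighting or ranking techniques, so plain TF-IDF cosine similarity stands in for them here, and the function names, the `threshold` parameter, and its default value are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumption; the paper's preprocessing is not given)."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (token -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each token.
    df = Counter(t for tokens in token_lists for t in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF so that tokens present in every document keep a nonzero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def align(source, target, threshold=0.5):
    """Align each source article to its most similar target article.

    source, target: dicts mapping article id -> article text.
    Returns (auto, manual): `auto` maps source ids to target ids whose best
    score reached `threshold`; `manual` lists (source id, top candidate ids)
    for articles that should go to human review.
    """
    src_ids, tgt_ids = list(source), list(target)
    vecs = tfidf_vectors([source[i] for i in src_ids] + [target[i] for i in tgt_ids])
    src_vecs, tgt_vecs = vecs[: len(src_ids)], vecs[len(src_ids):]

    auto, manual = {}, []
    for sid, sv in zip(src_ids, src_vecs):
        # Rank all target articles by similarity, best first.
        scored = sorted(
            ((cosine(sv, tv), tid) for tid, tv in zip(tgt_ids, tgt_vecs)),
            reverse=True,
        )
        if scored and scored[0][0] >= threshold:
            auto[sid] = scored[0][1]
        else:
            # Low confidence: hand the top candidates to a human indexer.
            manual.append((sid, [tid for _, tid in scored[:3]]))
    return auto, manual
```

In such a setup, raising the threshold shifts more articles into the manual queue and lowering it accepts more automatic alignments, which is the trade-off between accuracy and manual effort that the abstract refers to.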

Kern, R., Seifert, C., & Granitzer, M. (2010). A hybrid system for German encyclopedia alignment. International Journal on Digital Libraries, 11(2), 75–89. https://doi.org/10.1007/s00799-011-0069-5
