How to integrate diverse digital repositories into a unified information infrastructure that is accessible and discoverable through simple interfaces remains a central research question for digital libraries. Many collections are described by specialized metadata, which currently has to be mapped and crosswalked to a standard format before it can be used across collections. This metadata work can be expensive and resource-intensive. We describe work in progress on DISTIL (Document Indexing & Semantic Tagging Interface for Libraries), which aims to support federated cross-collection search in the humanities and social sciences. DISTIL proposes to support interoperability by generating Dewey Decimal Classification (DDC) 'tags' from individual metadata records. The resulting tags can then be used to support cross-collection browsing. We focus here on some of the initial pre-processing stages of the metadata workflow, which include cleaning and formatting metadata records in order to extract terms that can then be used to generate the DDC tags. We describe some initial strategies for this workflow and the issues it raises. © 2012 Springer-Verlag.
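The pre-processing stage mentioned above (cleaning metadata records and extracting terms) could look roughly like the following Python sketch. It is an illustration only, not the DISTIL implementation: the record layout, the field names in `TEXT_FIELDS`, the `STOPWORDS` list, and the functions `clean_text` and `extract_terms` are all assumptions introduced here, and the later step of mapping terms to DDC tags is not shown.

```python
import html
import re

# Hypothetical stopword list; a real workflow would likely use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}

# Hypothetical choice of metadata fields to mine for terms.
TEXT_FIELDS = ("title", "subject", "description")


def clean_text(value):
    """Unescape HTML entities, strip markup tags, and collapse whitespace."""
    value = html.unescape(value)
    value = re.sub(r"<[^>]+>", " ", value)
    return re.sub(r"\s+", " ", value).strip()


def extract_terms(record):
    """Return lowercased, de-duplicated terms in first-seen order."""
    terms = []
    seen = set()
    for field in TEXT_FIELDS:
        raw = record.get(field, "")
        # A field may hold a single string or a list of strings.
        values = raw if isinstance(raw, list) else [raw]
        for value in values:
            # Simplification: only ASCII letter runs are kept as tokens.
            for token in re.findall(r"[a-z]+", clean_text(value).lower()):
                if token not in STOPWORDS and token not in seen:
                    seen.add(token)
                    terms.append(token)
    return terms


record = {
    "title": "<b>Maps</b> &amp; charts of the Nile",
    "subject": ["Rivers", "maps"],
}
print(extract_terms(record))
```

The ASCII-only tokenizer is a deliberate simplification; humanities metadata often contains accented characters and multi-word subject headings, which a fuller workflow would need to preserve.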
Khoo, M., Tudhope, D., Binding, C., Abels, E., Lin, X., & Massam, D. (2012). Towards digital repository interoperability: The Document Indexing and Semantic Tagging Interface for Libraries (DISTIL). In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7489 LNCS, pp. 439–444). https://doi.org/10.1007/978-3-642-33290-6_49