Author gender metadata augmentation of HathiTrust digital library

Abstract

Bibliographic metadata is essential for describing digital library resources. As the size and number of bibliographic entities grow, high-quality metadata enables richer forms of digital library access, search, and use. Metadata records can be enriched through automated techniques. For example, a digital humanities scholar might use the gender of a set of authors in their literature analysis. In this study, we enrich the metadata description of a large-scale digital library, the HathiTrust (HT) digital library, by determining the gender of the authors in the public domain portion of the collection. The results are stored in a separate Solr index accessible through the HathiTrust Research Center services. The study resolved the gender of authors in 78.9% of cases in the HT public domain corpus. It suggests future research directions in capturing and representing the provenance of the contributing sources to enhance trust, and in machine learning to resolve the remaining names.
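The abstract does not spell out how gender was resolved, so the following is only an illustrative sketch of one plausible approach: looking up an author's first forename in a name table and reporting the share of records that could be resolved. The lookup table `NAME_GENDER`, the `AuthorRecord` fields, the "Surname, Forename, dates" heading format, and the field names produced by `to_index_doc` are all assumptions made for illustration, not details taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical lookup table. A real pipeline would likely combine several
# external name sources; this tiny dictionary is only for illustration.
NAME_GENDER: Dict[str, str] = {
    "jane": "female",
    "mary": "female",
    "john": "male",
    "william": "male",
}


@dataclass
class AuthorRecord:
    volume_id: str                 # hypothetical identifier
    author: str                    # assumed form: "Surname, Forename(s), dates"
    gender: Optional[str] = None   # None means "unresolved"


def first_forename(author: str) -> Optional[str]:
    """Return the lowercased first forename, or None if none can be found."""
    parts = [p.strip() for p in author.split(",")]
    if len(parts) < 2 or not parts[1]:
        return None
    tokens = parts[1].split()
    return tokens[0].lower().strip(".") if tokens else None


def augment(records: List[AuthorRecord]) -> List[AuthorRecord]:
    """Fill in the gender field wherever the forename is in the lookup table."""
    for record in records:
        name = first_forename(record.author)
        record.gender = NAME_GENDER.get(name) if name else None
    return records


def resolution_rate(records: List[AuthorRecord]) -> float:
    """Fraction of records whose gender was resolved."""
    if not records:
        return 0.0
    return sum(r.gender is not None for r in records) / len(records)


def to_index_doc(record: AuthorRecord) -> Dict[str, str]:
    """Build a flat dict of the kind a search index might ingest (field names assumed)."""
    return {
        "id": record.volume_id,
        "author": record.author,
        "author_gender": record.gender or "unknown",
    }


records = augment([
    AuthorRecord("vol1", "Austen, Jane, 1775-1817"),
    AuthorRecord("vol2", "Dickens, Charles, 1812-1870"),
    AuthorRecord("vol3", "Shelley, Mary Wollstonecraft, 1797-1851"),
    AuthorRecord("vol4", "Anonymous"),
])
rate = resolution_rate(records)
docs = [to_index_doc(r) for r in records]
```

Keeping unresolved names as an explicit "unknown" value, rather than dropping them, leaves the remaining cases visible for later resolution, which is the kind of follow-up work the abstract points to.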

Citation (APA)

Peng, Z., Chen, M., Kowalczyk, S., & Plale, B. (2014). Author gender metadata augmentation of HathiTrust digital library. In Proceedings of the ASIST Annual Meeting (Vol. 51). John Wiley and Sons Inc. https://doi.org/10.1002/meet.2014.14505101098
