Updates to our UM OAI provider

October 05, 2009

We have been making improvements to our OAI provider (UMProvider). We host the metadata for HathiTrust public domain texts through the provider, as well as all the metadata for text and image collections in the UM Digital Library.

Our first improvement was to make the provider faster to harvest. It uses MySQL tables to store, sort and provide access to the metadata, and our method for sorting the data was one of the causes of the slow harvesting.

Our second improvement comes from our investigation into the increasing number of deleted HathiTrust records that were showing up in the provider, and a discrepancy between the number of records in the provider and the number of records in our HathiTrust databases. We have not fully determined the cause of this, but we have been able to restore over 30,000 HathiTrust records that were marked as deleted in the provider.

Consequently, we recommend you harvest the provider from scratch, whether the entire metadata set or a particular set. It will be quick, and you'll get those missed records. We will keep you posted on further improvements.

(The UMProvider can be accessed via http://quod.lib.umich.edu/cgi/o/oai/oai?verb=ListRecords&metadataPrefix=oai_dc. There is useful information about the HathiTrust records in the provider at http://www.hathitrust.org/data.)
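For anyone scripting a fresh harvest, here is a minimal sketch in Python of a full ListRecords pass that follows OAI-PMH resumption tokens. It is an illustration under assumptions, not a description of how UMProvider works internally: the function names (`build_url`, `parse_page`, `harvest`) are made up for this example, and any set name you pass is up to you to look up via the provider's ListSets response. Omitting the `from` argument is what makes this a harvest from scratch.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL taken from the post; everything else here is illustrative.
BASE_URL = "http://quod.lib.umich.edu/cgi/o/oai/oai"
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(metadata_prefix="oai_dc", set_spec=None, token=None):
    """Build a ListRecords request URL.

    Per OAI-PMH, resumptionToken is an exclusive argument, so a
    follow-up request carries only the verb and the token.
    """
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec  # hypothetical set name supplied by caller
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def parse_page(xml_text):
    """Return ([(identifier, is_deleted), ...], next_token_or_None)."""
    root = ET.fromstring(xml_text)
    headers = []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        # Deleted records are flagged with status="deleted" on the header.
        deleted = header.get("status") == "deleted"
        headers.append((identifier, deleted))
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    token = None
    if token_el is not None and token_el.text:
        token = token_el.text.strip() or None
    return headers, token


def fetch_url(url):
    """Fetch one response page over HTTP and return it as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def harvest(set_spec=None, fetcher=fetch_url):
    """Yield (identifier, is_deleted) for every record, page by page."""
    token = None
    while True:
        page = fetcher(build_url(set_spec=set_spec, token=token))
        headers, token = parse_page(page)
        yield from headers
        if not token:  # an empty or missing token means the list is complete
            break
```

A harvester could then count live and deleted records with something like `sum(1 for _, deleted in harvest() if deleted)` and compare the result against a previous harvest. A production script would also want error handling and polite pauses between requests, which this sketch leaves out.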

Posted by Kat Hagedorn at 10:12 AM.
