HathiTrust Accessible Interface
October 20, 2009
For the past two years, the University of Michigan Library has been making many of our digitized texts (including in-copyright items) available to persons with print disabilities through the HathiTrust Digital Library. Our Dean, Paul Courant, recently posted about this project on his blog, so I thought it might be nice to offer more background and some technical information about it.
In order to determine the best method for the system, we began by conducting research in a number of areas. We explored the technology that users with print disabilities often use to access web content (primarily screen reader applications and assistive technologies like digital Braille devices), researched accessibility-related coding techniques, and met with the campus Services for Students with Disabilities (SSD). After weighing the pros and cons of different options, we decided to do a few things:
- Make our standard interfaces more accessible.
- Create a text-only book interface that is optimized for the specific needs of users with print disabilities (referred to internally as the "SSD interface").
- Create a system to grant additional access to the full-text of a digitized book for certain UM patrons, regardless of the book's copyright status.
Since HathiTrust is a multi-institutional and publicly available access system, we deemed it very important to improve the accessibility baseline. This was done fairly easily by making better use of web standards-based coding techniques such as proper use of headings and separating style from content. However, due to the structure of our current system, we were still only able to offer one book page at a time, which results in a less than ideal experience for actually reading a book. So, after talking to some print-disabled users, we determined that what was really needed was a simplified text-only interface that could be coded for optimal accessibility and display the entire concatenated text, from beginning to end, on a single web page. In order to do this, we needed to establish an access policy and authentication mechanisms. We accomplished this through collaboration with SSD, digital library developers and systems administrators, library managers, and Jack Bernard, the UM Assistant General Counsel. Once we worked out the access mechanism, implementing the interface was actually fairly simple. Since the HathiTrust Digital Library uses XML and XSLT, we just had to write a single XSLT style sheet to generate the code for any book as it is requested.
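To illustrate the idea of a single concatenated, heading-structured page, here is a minimal sketch. It is written in Python rather than XSLT and is not the style sheet we actually use. The function name `build_full_text_page`, the per-page headings, and the assumption that the OCR text arrives as a list of plain-text strings are all made up for the example.

```python
from html import escape


def build_full_text_page(title, pages):
    """Concatenate per-page OCR text into one simple HTML document.

    `pages` is assumed to be a list of plain-text strings, one per scanned
    page. Each page gets its own heading so that screen reader users can
    jump from page to page with heading navigation.
    """
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        "</head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for number, text in enumerate(pages, start=1):
        parts.append(f'<h2 id="page-{number}">Page {number}</h2>')
        # Treat blank lines in the OCR text as paragraph breaks.
        for paragraph in text.split("\n\n"):
            paragraph = paragraph.strip()
            if paragraph:
                parts.append(f"<p>{escape(paragraph)}</p>")
    parts.extend(["</body>", "</html>"])
    return "\n".join(parts)
```

Calling `build_full_text_page("Some Title", ["text of page one", "text of page two"])` returns one HTML string with an `h1` for the book and an `h2` for each page, with no styling or scripting mixed into the content.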
Here's how it works:
- A UM student or faculty member registers for the program with SSD.
- SSD notifies the Library, and the Library enables the patron's record for access.
- The SSD patron checks out a book that has been digitized (public domain or in-copyright).
- The Library Catalog system (Mirlyn) automatically sends the patron an email containing a URL that links to the SSD interface.
- The patron follows the URL and is prompted to log in.
- The system checks that the patron is eligible for the program and that the book is checked out to them. If so, they are given access to the SSD interface for as long as they have the book checked out.
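The check in the last step can be sketched roughly as follows. This is only an illustration of the logic described above, not our production code: the `Patron` class, its fields, and `may_view_full_text` are hypothetical names, and the real system draws this information from the login service and the catalog rather than from an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Patron:
    """Hypothetical stand-in for a patron record."""
    user_id: str
    enrolled_in_program: bool = False               # enabled by the Library after SSD notifies it
    checked_out: set = field(default_factory=set)   # IDs of volumes currently on loan


def may_view_full_text(patron, volume_id, authenticated):
    """Return True only when all three conditions from the workflow hold:
    the user has logged in, is enrolled in the program, and currently
    has the requested volume checked out."""
    return bool(
        authenticated
        and patron.enrolled_in_program
        and volume_id in patron.checked_out
    )
```

Because the loan is checked on every request, access ends on its own once the book is returned and the volume drops out of `checked_out`.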
Here's what it looks like: [screenshot of the SSD interface]
During every phase of this project we tried to get feedback from some potential users of this system. Services for Students with Disabilities put us in touch with a few students who piloted the project and provided feedback. We were also very lucky to be able to hire two UM School of Information graduate student interns to help work on this project. One student helped research coding techniques and drafted a set of departmental guidelines. The other, a blind student, conducted evaluations of the regular and SSD interfaces using a variety of assistive technologies. Additionally, we have worked with the National Federation of the Blind for input as well as a round of testing of the SSD interface which resulted in an official endorsement.
There are currently over 3 million University of Michigan volumes available via HathiTrust. Use of the SSD interface is still relatively low, averaging about 35 pageviews a month, but we hope this will increase as more books become available and more students learn about the service. Most of our HathiTrust Digital Library interfaces pass Section 508 and WCAG Priority 1 validation. We are currently working to get them all up to that standard and hopefully beyond.
- There is still much work to be done to ensure that our system is as accessible as possible. We are in a constant state of development, and we are now beginning to collaborate more with other institutions, so it is easy for edits to be made that cause the code to fail validation. As we continue to develop new tools and functionality, it will become more and more important to follow accessible coding conventions at all stages of development, as it is extremely easy to fall out of order with many people working on different parts of the system.
- We are working with OCR content that isn't as complete or flawless as we would like, so hopefully one day we'll be able to improve the quality, add descriptions for images, and allow users to suggest corrections.
- The full-text viewing feature is currently available only to UM students and faculty, but we hope that, through continued collaboration, a similar program will be established at other HathiTrust partner institutions.
- There are always improvements to be made! We welcome any comments or feedback about how to improve our system.
Updates to our UM OAI provider
October 05, 2009
We have been making improvements to our OAI provider (UMProvider). We host the metadata for HathiTrust public domain texts through the provider, as well as all the metadata for text and image collections in the UM Digital Library.
Our first improvement was to make the provider faster to harvest. It uses MySQL tables to store, sort, and provide access to the metadata, and our method for sorting the data was one of the causes of the slow harvesting.
Our second improvement comes from our investigation into the increasing number of deleted HathiTrust records that were showing up in the provider, and a discrepancy between the number of records in the provider and the number of records in our HathiTrust databases. We have not fully determined the cause of this, but we have been able to restore over 30,000 HathiTrust records that were marked as deleted in the provider.
Consequently, we recommend you harvest the provider from scratch, whether the entire metadata set or a particular set. It will be quick, and you'll get those missed records. We will keep you posted on further improvements.
(The UMProvider can be accessed via http://quod.lib.umich.edu/cgi/o/oai/oai?verb=ListRecords&metadataPrefix=oai_dc. There is useful information about the HathiTrust records in the provider at http://www.hathitrust.org/data.)
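For anyone scripting a fresh harvest, here is a minimal sketch of the two pieces a harvester needs: building the ListRecords request and reading one page of the response. It relies only on standard OAI-PMH 2.0 behavior (the `resumptionToken` argument, and `status="deleted"` on record headers) and on the base URL given above. The function names are made up for the example, the sample identifiers in the usage note are invented, and fetching, retries, and error handling are left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "http://quod.lib.umich.edu/cgi/o/oai/oai"


def list_records_url(metadata_prefix="oai_dc", set_spec=None, resumption_token=None):
    """Build a ListRecords request URL.

    Per OAI-PMH, a resumptionToken is an exclusive argument: when it is
    present, only `verb` and `resumptionToken` are sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return BASE_URL + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (live_ids, deleted_ids, next_token) for one response page."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    token = root.findtext(f".//{OAI_NS}resumptionToken")
    # An absent or empty token means this was the last page.
    return live, deleted, (token or "").strip() or None
```

A harvest from scratch would start with `list_records_url()`, fetch that URL, pass the response body to `parse_page`, and repeat with `list_records_url(resumption_token=token)` until the returned token is `None`. Keeping the deleted identifiers separate makes it easy to compare against an earlier harvest and spot records that have been restored.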