Browsing in MBooks?

June 18, 2008

Last month I attended the annual Digital Library Federation spring meeting and David Rumsey, renowned for his collection of historical maps, was one of the keynote speakers. Prompted by David Rumsey’s map ticker (http://www.davidrumsey.com/ticker.html) and what he said in passing about "moving among the maps" in Second Life, I’ve been brooding about the perceived lack of browsability in the digital library context. How would we "move among the books" in MBooks?

Presumably, one way we could do it would be to make a book ticker – perhaps with covers or title page thumbnails, arranged in call number order (as one would browse a shelf).

That raises a few immediate practical questions:

1. Do we have identified title pages or cover thumbnails for all the books? What do we do for cases where we don’t?
2. Should we precompute thumbnails or try to derive them on the fly?
3. Can we use Mirlyn call numbers to browse? They aren’t in the MARC record per se.

These practical questions raise a number of other usability issues, of course. Some are about thumbnails – what size would the thumbnails have to be to make them useful? When you clicked on them, where would you end up? Could you hover over them and see some volume metadata? Can we show thumbnails for in-copyright items? Others are about call number browsing – would you really want to browse all items by call number, or just those from a given library? That is, browse the "real" stacks for a holding location, like Shapiro Undergraduate Library, or the superset of all libraries, the stacks as they’ve never been in the physical world?

To me, the latter choice seems like the best one – it’s something that is only possible in a digital library, as we’d be drawing together items that are housed in separate buildings yet may be related. How do you imagine browsing in the digital library?

Posted by Chris Powell at 04:23 PM. Permalink | Comments (3)

Google Book Search links in Mirlyn

June 13, 2008

You may have noticed that the links to Google Books in Mirlyn have a little more information lately. We have always provided links to online copies in both Google Book Search and MBooks. We're now using the Google API to provide links to any book in Mirlyn that is also in Google Book Search.

We provide a thumbnail image of the cover or title page (although there's been some controversy about this lately). We also tell you what level of access you can expect if you follow the link to Google Book Search. Google Books has three levels of access, while MBooks has only two:

Google Book Search terms    MBooks terms
Snippet view                Search Only
Limited view                (no equivalent)
Full-text                   Full Text

In Google Book Search, "Snippet view" means that you cannot view the full-text, but can see up to three text snippets; "Search Only" in MBooks means that you can search for keywords, and discover where all the matches occur, but can't view the pages. (See this previous post for more about "Search Only.") "Limited view" means that the book is part of Google's Publisher Partnership, and a limited number of pages is available for reading. You won't be able to see the entire book, but you will have access to a significant number of pages. "Full-text" in Google Book Search means that you can view the entire text, and get a PDF file of the entire text, while "Full Text" in MBooks means that you can view the page images using the MBooks pageturner, and get a 10-page PDF excerpt.

If you look at very many records for MBooks in Mirlyn, you will soon note that in some cases the access levels differ between MBooks and Google Book Search.

In this last example you'll have full-text in either Google Books or MBooks, so you can decide which interface you prefer. Knowing how to read the Mirlyn record will help you find the best access for any given book. Happy reading!

--Perry Willett
--Head, Digital Library Production Service

Posted by Perry Willett at 09:13 AM. Permalink | Comments (11)

Preview of the new Collection Builder tool

June 09, 2008

Over the past year we've been developing a new collection building tool to be used in conjunction with the MBooks "page-turning" application already available. This tool will allow users to create their own collections of MBooks items and view public collections created by others. Users will also be able to do full text searching across all items within a collection.

We're still working out some bugs and interface issues but hope to release soon. Check back in July!


Posted by Kat Hagedorn at 11:36 AM. Permalink | Comments (5)

MTagger Usability Research

June 08, 2008

The Usability Working Group (UWG), along with our two fantastic and hardworking interns, is spending the summer conducting usability research on MTagger. We started by doing a heuristic evaluation and cognitive walkthrough. The goal of these evaluations was to reveal a preliminary set of issues pertaining to the usability, functionality and aesthetics of MTagger and to help prioritize further benchmarks. This report is now online.

We've also completed a "guerilla" test and we're in the process of conducting interviews and preparing for formal user tests and a survey. Reports for those studies will also be put online when they're done.

Link to MTagger Usability Reports

- Suzanne Chapman
-- UWG chair/DLPS Interface & User Testing Specialist

Posted by Suzanne Chapman at 02:22 PM. Permalink | Comments (1)

Page numbers and URLs in MBooks

June 06, 2008

We get questions from MBooks users (most recently from dfulmer in the comments to this post) about how to link to pages, what the URL parameters such as "num" and "seq" mean, and other questions about links and page numbers.

There are a couple of issues. The first is about URLs. The most stable and persistent URL is the one that we include in the Mirlyn record, and also at the top of the pageturner with other descriptive metadata. It's called a "handle" and is a robust persistent identifier managed by CNRI (more on handles at http://www.handle.net/). Handles look like this:

http://hdl.handle.net/2027/mdp.39015021038404

and this is the URL that we encourage people to use and save. However, since they all start with http://hdl.handle.net/2027, people don't recognize them as belonging to the University of Michigan. Users are much more familiar with URLs that include the umich.edu domain. Nevertheless, since these handles are persistent and robust ("2027" is registered with CNRI as belonging to us) these are the URLs that should be used.
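As a small illustration of the handle pattern shown above, here is a minimal Python sketch that builds a handle URL from a volume identifier and recovers the identifier from a handle. The function names and the term "volume identifier" for the part after "2027/" are assumptions made for this example, not part of MBooks itself.

```python
# Prefix taken from the example handle above; "2027" is the registered prefix.
HANDLE_PREFIX = "http://hdl.handle.net/2027/"


def handle_url(volume_id: str) -> str:
    """Build the persistent handle URL for a volume identifier (hypothetical helper)."""
    volume_id = volume_id.strip()
    if not volume_id:
        raise ValueError("volume identifier must not be empty")
    return HANDLE_PREFIX + volume_id


def volume_id_from_handle(url: str) -> str:
    """Return the part of a handle URL that follows the 2027 prefix."""
    if not url.startswith(HANDLE_PREFIX):
        raise ValueError("not a handle under the 2027 prefix: " + url)
    return url[len(HANDLE_PREFIX):]


if __name__ == "__main__":
    print(handle_url("mdp.39015021038404"))
```

Saving the identifier and rebuilding the handle from it avoids depending on the local domain names and URL parameters, which may change.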

Other URLs will be less stable. The sharper-eyed among our readers will have noted that our URLs recently changed from starting with "mdp.lib.umich.edu" to "sdr.lib.umich.edu". We will redirect users any time they use a URL starting with "mdp.lib.umich.edu" but these local domain names will change over time. The same is true for the URL parameters such as "page," "num," "seq," "orient," etc. Phil Farber's response to the same post noted above provides documentation on what these mean, but be aware that these will change without warning. URL hacking will lead to tears before bedtime.

The other related issue has to do with page numbers and other metadata. People will notice that many MBooks include a table of contents with page numbers on the left-hand side, such as this one. You may also notice that some books lack this table of contents, and use "sequence" instead of page numbers. Here's an example of a book for which we do not have page numbers.

It all has to do with the metadata. At a minimum, we know the sequence in which the pages of any given book should be displayed. The pageturner buttons for forward and backward use this information to work properly, but for some books, this is all the information we have. Since the sequence of pages starts with the front cover, it's unlikely that the sequence number will match the actual page number. (And as Suzanne noted in her comments to this post, if someone has a better term than "sequence" please let us know!) Many of these books without page numbers were early efforts by Google; they are sending us newer, better versions of these books, so eventually the entire collection will include page numbers.

In many (soon most or all) cases we will have page numbers, along with additional metadata identifying title pages, tables of contents, first pages of sections, and other page features. We get these metadata from Google. We don't know how Google generates them, but it's undoubtedly an automated method. This means that they won't be perfect. When we do have metadata indicating the title page, we will open the book to the title page as a default. If we don't have any metadata about the title page, we will open to the first image (usually the front cover).
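The default-page rule described above can be sketched in a few lines of Python. This is only an illustration: the list-of-labels representation and the "TITLE" label are assumptions for the example, not the actual format of the metadata we receive.

```python
from typing import Optional, Sequence


def default_sequence(page_features: Sequence[Optional[str]]) -> int:
    """Pick the sequence number a book should open to.

    page_features[i] holds a feature label for sequence number i + 1,
    or None when nothing is known about that page. The "TITLE" label
    is a hypothetical stand-in for whatever marks a title page.
    """
    for seq, feature in enumerate(page_features, start=1):
        if feature == "TITLE":
            return seq  # open to the first page marked as a title page
    return 1  # no title page metadata: open to the first image


if __name__ == "__main__":
    print(default_sequence([None, None, "TITLE", "TOC"]))
    print(default_sequence([None, None, None]))
```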

Page numbers are, to quote the kids, whack. In some books, they are out of sequence, or repeated, or misnumbered, or missing. With many journals, the library has bound together two or more issues, each with its own pagination from 1 to whatever. Therefore, the online volume could have multiple pages numbered 207, as in the example that David points to in his comments to the post mentioned above. Right now, MBooks will take you to the first instance of p. 207 if you type that into the "goto" box. We could probably do something to alert people to the fact that there are multiple pages numbered 207, and give them links to each of them.
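One way to handle repeated page numbers, sketched here in Python under assumed data structures, is to index every sequence number at which a printed page number occurs, so that a "goto" lookup can return all matches rather than only the first. The function names and the list-of-strings input are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def index_page_numbers(printed: Sequence[Optional[str]]) -> Dict[str, List[int]]:
    """Map each printed page number to every sequence number where it appears.

    printed[i] is the printed page number for sequence number i + 1,
    or None when the page has no number.
    """
    index: Dict[str, List[int]] = defaultdict(list)
    for seq, number in enumerate(printed, start=1):
        if number is not None:
            index[number].append(seq)
    return dict(index)


def goto(index: Dict[str, List[int]], number: str) -> List[int]:
    """Return all sequence numbers for a printed page number (empty if none)."""
    return index.get(number, [])


if __name__ == "__main__":
    # Two bound-together issues, each with its own pagination.
    pages = [None, "1", "2", "207", "1", "207"]
    print(goto(index_page_numbers(pages), "207"))
```

When the returned list has more than one entry, the interface could show a link for each match instead of silently jumping to the first.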

We need to consider having persistent URLs to individual pages. People want to refer to individual pages, and we should have a method with a stable URL to allow them to do it. We could also do more to have a predictable method of referring to a page. Ed Vielmetti recently wrote some ideas about this in his blog.

We will look at this more carefully soon, once we get through the current round of development for collection builder and other new features.

--Perry Willett
--Head, Digital Library Production Service

Posted by Perry Willett at 10:38 AM. Permalink | Comments (0)

New REST-ful API for Mirlyn

June 02, 2008

Earlier this week, I had a chance to give a brown-bag session on a new API into our catalog, Mirlyn (Ex Libris's Aleph software).

One of the great things about working at a library is the depth and breadth of data at our disposal. One of the more frustrating things is how terribly locked-up all that data is.

What, I wondered, would happen if I could radically lower the bar of entry to the catalog for programmers of even marginal ability? The University of Michigan has a pretty big collection, and there's no telling what people could do with that data if getting at it were a lot easier, if they didn't need special permission or access to a particular machine, and if it were useful inside the browser using JavaScript as well as in server-side operations.

So I went about trying to create a system that fulfilled, at least partially, those criteria. Unlike many integrated library systems, Aleph already provides a whole suite of interfaces, including an XML-based API they call the XServer. Unfortunately, the XServer has, in my opinion, a number of shortcomings:

  • As its name suggests, it's based on XML, which can be confusing for the uninitiated. Remember, my focus is on weekend programmers, maybe just writing JavaScript inline in an HTML document.
  • URLs for an XServer search don't mean anything. First you do a search and get back a search set. Then you ask for some records using that search set in the URL. The search set is essentially a random identifier, so looking at the URL doesn't tell you anything about what search was done or what you're getting, and you're sure not going to construct one by hand.
  • The interface is...messy. It's clearly a system that grew up organically, and there are a lot of inconsistencies concerning how things are named, what parameters are called, etc. I wanted an interface where you could take a good guess at what the URL should look like and be right 90% of the time.

When I was all done, I had a system that supports queries like this:

A book by ISBN:
http://mirlyn.lib.umich.edu/cgi-bin/api/basic.json/isbn/097669400x
Or a couple:
http://mirlyn.lib.umich.edu/cgi-bin/api/basic.json/isbn/0596000278;097669400x?records=all
The most recent 10 books by anyone named 'Bonk':
http://mirlyn.lib.umich.edu/cgi-bin/api/basic.json/author/bonk?records=1-10
And how about the next ten?
http://mirlyn.lib.umich.edu/cgi-bin/api/basic.json/author/bonk?records=11-20

You can search by title, author, keyword, most standard numbers -- just about anything you can use to search the catalog via the website. The full list of searchable indexes and their aliases, as well as all the current Mirlyn API documentation, is on the new MLibrary API wiki.

While all the above examples return a subset of available data in the JSON format, you can also return XML if you're more comfortable with it, and besides the "basic" data you can get circulation status or full MARC records (expanded into either XML or JSON). Just replace "basic.json" in the above URLs with something like "marc.xml" or "circstatus.json".
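To show how predictable the URL pattern is, here is a minimal Python sketch that assembles URLs like the examples above. The function name and its parameters are made up for illustration, and percent-encoding the search terms is an assumption about how the API treats special characters; the wiki documentation is the authority on what the API accepts.

```python
from typing import Iterable, Optional, Union
from urllib.parse import quote

# Base path copied from the example URLs above.
API_BASE = "http://mirlyn.lib.umich.edu/cgi-bin/api"


def mirlyn_api_url(
    index: str,
    terms: Union[str, Iterable[str]],
    records: Optional[str] = None,
    view: str = "basic",
    fmt: str = "json",
) -> str:
    """Build a Mirlyn API URL (hypothetical helper, not part of the API)."""
    if isinstance(terms, str):
        terms = [terms]
    # Multiple terms are joined with ';' as in the two-ISBN example.
    joined = ";".join(quote(term, safe="") for term in terms)
    url = f"{API_BASE}/{view}.{fmt}/{index}/{joined}"
    if records is not None:
        url += f"?records={records}"  # e.g. "all", "1-10", "11-20"
    return url


if __name__ == "__main__":
    print(mirlyn_api_url("isbn", "097669400x"))
    print(mirlyn_api_url("author", "bonk", records="11-20"))
    print(mirlyn_api_url("isbn", "097669400x", view="marc", fmt="xml"))
```

The resulting URL can be fetched with any HTTP client; the sketch stops at building the string so that it makes no claims about the shape of the response.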

There's still a lot to do (allow user-defined sorting, let people browse by call number, etc.), but it works, it's useful, and it's friendly enough that people can start digging into it if they want.

I've put some simple JavaScript examples on the MLibrary Labs page; check them out, and drop me a note (or, better yet, comment here!) if you have questions or ideas.

Posted by Bill Dueber at 08:51 AM. Permalink | Comments (1)