Google Still Not Indexing Hidden Web URLs

July 22, 2008

Read our recent article in D-Lib Magazine:
http://dx.doi.org/10.1045/july2008-hagedorn.

This report is a follow-up to the McCown et al. article in IEEE Internet Computing two years ago [1], in which the researchers investigated what percentage of URLs from OAI records appeared in the Google, Yahoo and MSN search indexes. We were interested in whether Google in particular had increased the number of OAI-based resources in its search index.

Google does not seem to have indexed more of the hidden web since the McCown et al. article was published in 2006. We would venture to conclude that Google has not endeavoured to increase its support for, and access to, OAI materials. Even taking into account the caveats in our report, we would also conclude that aggregations of OAI records are as valuable for user research now as they were two years ago.

[1] McCown, F., Liu, X., Nelson, M. L., and Zubair, M. "Search engine coverage of the OAI-PMH corpus." IEEE Internet Computing 10:2 (March/April 2006) pp. 66-73.

Posted by Kat Hagedorn at 04:38 PM.
