February 12, 2009
Work for the Patent Office, for free!
The Peer-to-Patent system created by Beth Noveck's group at NYU Law School and being piloted by the U.S. Patent Office has gotten a fair bit of attention. The basic idea is to gather user-contributed content from experts who can help patent examiners figure out whether a proposed invention is novel (no prior art). Anyone can submit comments on the posted patent proposals, and in particular can cite to evidence of prior art (which generally leads, if valid, to denial of the patent application). The purpose is to speed up patent reviews, and in particular to help prevent the granting of invalid patents, because it is often costly, time-consuming and chilling to later innovation to fight and prove a granted patent is invalid.
Andy Oram wrote an editorial in the Feb 2008 Communications of the ACM urging computer scientists to participate (viewing article may require subscription). He explained the system, and why it would be good for innovation for experts to donate their time to read and comment on patent applications.
Why would experts -- whose time is somewhat valuable -- want to do this? Andy argues that the primary reason is public service: donate to create a public good (better software patent system) for all. There are lots of ideas of things that would be "good for all" that require volunteer donations of time, effort, money. It's actually not a given that such public goods are a good idea: the value of a public good does not always or automatically exceed the cost of the time or other resources donated by the people who created it. The experts whom Andy seeks to contribute to Peer-to-Patent are highly trained people whose time is generally valued quite highly. In any case, if P-to-P depends on volunteer contributions by experts, how likely is it to succeed? These are people who already feel deluged by requests to volunteer their time to referee conference and journal articles, advise students on projects, advise government, serve on department and university committees, serve on professional organization committees and edit journals, etc., etc. I know few serious, successful academics who work less than 50 or 60 hours a week already.
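To make the cost-benefit point concrete, here is a minimal sketch of the comparison. The function name, the dollar figures and the hours are all hypothetical, chosen only for illustration; nothing here is measured from Peer-to-Patent.

```python
def public_good_is_worthwhile(total_benefit, hours_by_expert, hourly_value_by_expert):
    """Compare the value of a public good with the opportunity cost of donated time.

    All inputs are hypothetical illustration values, not data from Peer-to-Patent.
    Returns (is_worthwhile, total_cost).
    """
    # Opportunity cost: each expert's donated hours times the value of that expert's hour.
    total_cost = sum(h * v for h, v in zip(hours_by_expert, hourly_value_by_expert))
    return total_benefit > total_cost, total_cost


# Three experts each donate 10 hours valued at $200/hour (assumed numbers).
worthwhile, cost = public_good_is_worthwhile(5000, [10, 10, 10], [200, 200, 200])
print(worthwhile, cost)
```

With these assumed numbers the donated time costs more than the benefit it produces, which is the case the paragraph above warns about.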
Andy also suggests another reason to volunteer time for Peer-to-Patent: the bad patent you block may save your startup company! Now we're talking... a monetary incentive to "volunteer" time. But this is a bit problematic too: it points out a strategic concern with P-to-P. Potential competitors, or entrepreneurs who at least want to use the disclosed invention, have an interest in trying to block patent applications, and may try to do so even if the invention is legitimate. They can flood the Patent Office with all sorts of "prior art", which may not be valid, but now the patent examiners will have more work to do. And just as patent examiners may conclude incorrectly that a patent application is valid, so may they conclude incorrectly that one is invalid. It's not prima facie obvious, especially given that those most motivated to "donate" time and effort are those who themselves have a financial stake in the outcome, that user-contributed content in this setting will be a good thing, on balance.
March 29, 2008
Presentation at Yahoo! Research on user-contributed content
Yahoo! Research invited me to speak in their "Big Thinkers" series at the Santa Clara campus on 12 March 2008. My talk was "Incentive-centered design for user-contributed content: Getting the good stuff in, Keeping the bad stuff out."
My hosts wrote a summary of the talk (that is a bit incorrect in places and skips some of the main points, but is reasonably good), and posted a video they took of the talk. The video, unfortunately, focuses mostly on me without my visual presentation, panning only occasionally to show a handful of the 140 or so illustrations I used. The talk is, I think, much more effective with the visual component. (In particular, it reduces the impact of the amount of time I spend glancing down to check my speaker notes!)
In the talk I present a three-part story: UCC problems are unavoidably ICD problems; ICD offers a principled approach to design; and ICD works in practical settings. I described three main incentives challenges for UCC design: getting people to contribute; motivating quality and variety of contributions; and discouraging "polluters" from using the UCC platform as an opportunity to publish off-topic content (such as commercial ads, or spam). I illustrated with a number of examples in the wild, and a number of emerging research projects on which my students and I are working.
March 02, 2008
Looking for (well-paid, highly-trained, very busy) volunteers
The Peer to Patent project is one of my favorite recent examples of a user-contributed content (UCC) project, not because it has been very successful (yet), but because it demonstrates the surprising and important ways that UCC may come to benefit society. It's not all Wikipedia and social networking!
Peer to Patent is a project started by Prof. Beth Noveck and her Do Tank group at NYU Law School. The US Patent Office adopted it for a one-year pilot starting 15 June 2007. It is a system to post patent applications for public comment, in particular seeking suggestions about possible prior art, to help USPTO examiners determine whether a patent should be granted. It was motivated by a widely held sense, particularly in the area of software and business process patents, that the USPTO has been overwhelmed by the number of applications and the advances in technology in recent years, and that many patents have been granted which can have the effect of stifling new innovation. During the first six months of the pilot, over 1800 people registered to participate, and over 150 prior art references were submitted on the 24 patent applications that can be reviewed through this system.
In the February 2008 issue of the Communications of the ACM, Andy Oram published a column about the project in which he discussed the incentives challenges that may stand in the way of success. First, of course, not just any user is likely to be able to make quality contributions: to be useful, a contributor must have serious expertise in the area of the patent in order to be able to understand the application well enough to recognize possible prior art, and must know the literature well enough to identify the prior art. That's not a lot of people, and they aren't the type who have a lot of underpaid hours to volunteer. Indeed, he quotes Jon Bentley of Avaya Labs who points out that the whole essence of patenting is making money, and that the people in the best position to contribute may be those least interested in doing so.
One of the hopes of the project is that it is the monetary incentive itself -- not provided by Peer to Patent, but indirectly -- that will induce people to contribute: competitors. That is, if some company is using technology on which a patent is proposed, or is developing something similar, it will have a financial interest in seeing that the patent is not granted. Thus, they might be the ones to put the time in to review the application and propose the prior art. Although they are interested parties, as Oram says "prior art is prior art no matter who finds it".
Interesting problem, and I'm looking forward to seeing whether or not Peer to Patent can succeed (and I hope it does, because I tend to think that too many software and business patent applications are approved).
January 13, 2008
All user-contributed, all the time (almost)
I've been fascinated for the past couple of years with businesses that rely on user-contributed content (UCC) for substantial inputs to production. It is sometimes jokingly referred to as the "Tom Sawyer business model": get your friends to whitewash the fence for you, without paying them (in fact, they paid Tom quite handsomely, including "a key that wouldn't unlock anything, a fragment of chalk...and a dead rat on a string").
Randall Stross writes in today's New York Times about two fairly well-known businesses that have nearly perfected the art: Plenty of Fish, and Craigslist. Craigslist is a wide-open classified advertising service where employers post jobs, homeowners sell their old "Monopoly -- Star Wars Version" games and unwanted gifts, and, most piquantly, people of every shape, age, color and preference seek partners for a nearly infinite variety of polymorphously perverse, chaste and romantic interactions. Craigslist is one of the top 10 visited English language sites, has versions for 450 localities in over 50 countries, and runs with only 25 employees. All of the content is written, edited (such as it is) and maintained voluntarily by users; user volunteers also provide most of the customer service through help forums.
Plenty of Fish is more specialized and not quite as successful, but perhaps more remarkable. It is a dating service localized to 50 Canadian, US and Australian cities. Markus Frind created it and devotes only about 10 hours a week to running it...and only in the past year did he hire his first employee. Yet the site has 600,000 registered users (a number that grows rapidly despite the purging of 30,000 inactives a month), and receives 50,000 new photos per day. Spam-filtering of text is done by software. Filtering of photos (to make sure they are human and clothed) is done by user volunteers: in the past year the top 120 volunteers scanned over 100,000 photos each! The users provide the customer service too, through help forums.
Great business model: have the users whitewash the fence, and you work 10 hours a week for $10 million in annual profits (Stross estimates that Frind's claim about his advertising-only profits is plausible). What are the generalizable principles? How can *I* start such a business and succeed? (The road is littered with UCC-driven businesses that never turn a profit.)
It is obvious that one of the most important questions is why? Why would users volunteer the time and effort to provide the content, the customer service, the photo filtering, etc.? You may think it's obvious why users want to visit Plenty of Fish: there are a lot of lonely hearts out there. And it is 100% free to users: Frind only charges advertisers. Of course, without user effort, it won't succeed: there will be no information about potential life partners, no help information, and lots of undesirable photos polluting the service. But no individual user needs to contribute anything: there is no requirement for volunteer hours (as there is at our local food coop), there is no public tracking of effort and peer pressure to pull your weight. It's a free-rider's dream.
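The free-rider logic can be sketched with a standard linear public goods game. This is a textbook stylization, not a model of Plenty of Fish itself; the payoff numbers and the function name are assumptions for illustration.

```python
def payoff(contributes, others_contributing, benefit_per_contribution=1.0, cost=3.0):
    """Payoff to one user in a linear public goods game (hypothetical numbers).

    Every contribution, by anyone, is worth `benefit_per_contribution` to each user;
    only the contributor pays `cost`.
    """
    total_contributions = others_contributing + (1 if contributes else 0)
    return benefit_per_contribution * total_contributions - (cost if contributes else 0.0)


n_users = 10
# Whatever the other nine do, a user is better off not contributing...
for others in range(n_users):
    assert payoff(False, others) > payoff(True, others)

# ...yet everyone contributing beats no one contributing.
print(payoff(True, n_users - 1), payoff(False, 0))
```

Because one's own contribution costs more than it returns to oneself (3 versus 1 here), not contributing is the dominant strategy, even though all ten users would be better off if everyone pitched in.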
Contributing content is easy: if you don't submit a profile you aren't going to get any dates. But what about photo scanning? Yes, you want to scan photos anyway: that's why you're there. But why not let someone else filter out the junk so you only have to filter the worthwhile photos? Is there that much of a first-mover advantage that you are willing to filter 100,000 photos per year to have a shot at being the first to contact the newest hunk? My guess is that the expected return on that investment is pretty low.
And why spend your time providing free help service to other users? Maybe Plenty of Fish is lucky to have a demographic for whom the value of time is unusually low (lonely single people with nothing else to do on Saturday night), but that just means the cost is lower to make the contribution: what is the benefit? Is it that the volunteer helpers are trying to be noticed as helpful, well-informed web geeks as a way of attracting dates?
I think the answers to these questions are transparently not obvious. If the answers were easy, we'd have a lot more people working 10 hours a week to make $10 million per year. And the answers are not likely to be something that involves only traditional economic views about incentives and motivations. Developing generalizable principles about the motivations for user-contributed content will surely need to draw on psychological explanations as well, from the psychology of personality and self, and social psychology (at least).
July 10, 2007
The social psychology of Facebook?
John Kirriemuir wrote a casual entry in his blog about the "psychology of Facebook". It is a lighthearted piece, but thoughtful. He suggests various informal hypotheses about why Facebook is succeeding, focusing in particular on the effort people make to grow their networks.
I would like to start learning about social psychology theory and what it might usefully say for incentive-centered design of information systems. My expertise in ICD is largely grounded in individual utility maximization and game theory. I have been saying for the last couple of years that "social motivations" are clearly important for some of the fundamental issues (motivating people to contribute to public resources, motivating them to make effort sufficient to generate high-quality contributions, and motivating them not to misuse an open-access platform for unintended purposes). But other than my instincts and casual observation, I have little to go on.
Kirriemuir is not a social scientist (and is clear that he is not claiming to be), and his article is also casual. The social motivations he suggests are not clearly enough defined to test them or generalize to other settings, and his analysis is ex post description, which really does not serve as explanation (in the sense of enabling us to predict or successfully design in other settings). But he asks good questions, and I think he is right that humans respond to various predictable social motivations in ways that are important for the success or failure of different social information systems.
April 11, 2007
Incentives for bookmarking
My Ph.D. student, Rick Wash, together with Emilee Rader, has a new paper on incentives for bookmarking in del.icio.us. This paper will be appearing, after some revision, in ASIST 2007, as "Public Bookmarks and Private Benefits: An Analysis of Incentives in Social Computing".
In this study, based on in-depth field interviews of del.icio.us users, they conclude that
metadata reflecting who bookmarked a webpage better supports information seeking than free-form keyword metadata (tags). We explain this finding by describing differences in the way that the design of del.icio.us motivates users to contribute by providing personal benefits for bookmarking and tagging.
Incentives and tagging (Library Thing vs. Amazon)
Rick Wash pointed me to an interesting blog article about a comparison of book tagging on LibraryThing and on Amazon. The basic fact asserted: tagging is wildly successful on LibraryThing, and has barely had any meaningful usage on Amazon.
The more interesting point for us: why? The author suggests that the incentives are aligned much better at LibraryThing. At some level, that's tautologically true, but what we might learn from is what the incentives are.
The rather obvious, but important point the author makes (but here in the pithier words of one of the commenters on the post):
people do stuff on the Internet that is useful to them, not out of the desire to make a nifty tagsonomy.
The result may be that a very valuable public good is created (which is true at LibraryThing), but it is usually created because the individuals contributing were getting enough value for themselves. This is the compelling logic behind the private provision of public goods.
On LibraryThing, people are cataloguing their own book collections, for their own purpose. Tagging creates organization, that can be used for sorting, reporting, finding. This same motivation is at work on flickr (photos) and del.icio.us (bookmarks).
On Amazon, people are searching to buy books they haven't read: what do they gain from tagging them? (Some suggest tags can be used to create complex categorized wish lists, but how much value do they add to the flat wish list when few people realistically keep more than a dozen or two items on it, and tagged structures are not easily viewable by potential gift givers?)
Of course, as striking as this example is, and as apparently compelling as the logic is, it is not so easy to explain all user-contributed content this way. One obvious relevant example: Why are people spending so much time writing book reviews on Amazon? Surely not primarily to create a set of notes to jog their memory later about what they thought about a book?
November 30, 2006
Yelp: Local reviews via social networking site: why contribute?
So, reviews of local businesses written by local patrons are popular. Why not? Newspapers have always done well running "Best of ___" or "Reader's Choice" contests. Now we have Yelp.com, Judy's Book, Intuit's Zipingo, Insider Pages, and offerings from Yahoo!, Microsoft Live and other players. Even our small city (Ann Arbor, MI) has about 250 businesses reviewed by the newest entrant, Yelp.
And the venture capitalists are giving the new players some dough.
But, why? These sites will make revenues if they sell ads, which should work if there are eyeballs since the eyeballs will be looking specifically for businesses in the local area so advertising on the page should have a good return. But to get eyeballs, these sites have to get volunteer labor to enter ratings and write reviews. And those volunteers come from a diffuse group of local business patrons, many of whom don't know from Web 2.0, and even fewer know about Yelp.com. And even if they know, what's in it for the volunteers?
It's possible that these Web 2.0 companies are simply using Incentives 1.0: They could hire paid reviewers who at least seed the site with reviews on a number of popular businesses in each city. Yelp and the others claim that they don't do this: "real reviews from real people" (I guess we're supposed to assume that paid employees are not real people). But how would users know if they did? What forfeitable bond is Yelp posting to convince us they are trustworthy? Or what if they bribed "real people" to do reviews by sending a salesperson to the establishments and handing out bling in exchange for promises to enter a review?
There's another old-school way to get review content generated, too: tell the business owners about your site, and they'll take the initiative to write their own reviews (the "Amazon" problem). And so that they look popular -- not just loved by one critic -- they ask their mothers and cousins to submit reviews too. Again, how could we tell?
November 29, 2006
Research presentation: Web 2.0 and ICD
On 20 Nov 06 I gave an invited plenary Association Lecture at the Southern Economic Association Annual Conference in Charleston, SC. The title was "Getting the good stuff in, keeping the bad stuff out: Incentives and the Web". Here are the slides (not PowerPoint!).
In this talk geared to professional economists I explained the user-contributed content explosion that is one characteristic of so-called Web 2.0, and showed that this is happening through all phases of information production, organization, retrieval and use. I then discussed three fundamental economic issues that arise with user-contributed content: getting the good stuff in (private provision of public goods); keeping the bad stuff out (pollution); and evaluating the stuff (signaling, reputation). Familiar topics to the hordes who read this blog!
I finished with a simple elaboration to illustrate how ICD methods could be used to design mechanisms for dealing with these problems. The model is based on an event that occurred last spring on Digg.com.
May 19, 2006
Volunteer grid computing projects
Most people have heard of SETI@Home, the volunteer distributed grid computing project in which computer owners let software run on their machine when it is idle (especially at night) that helps search through electromagnetic data from space in an effort to find communications from extraterrestrials. But this is only one of many such projects; over a dozen are described in "Volunteer Computer Grids: Beyond SETI@home" by Michael Muchmore, many of them devoted to health applications.
Why do people donate their computer cycles? At first glance, why not? These programs, most of which run BOINC (Berkeley Open iNfrastructure for Networked Computing), are careful to only use CPU cycles not in demand by the computer owner's software, so the cycles donated are free, right? Well, sort of, but it takes time to download and install the software, there is some risk of infecting one's machine with a virus, many users may perceive some risk that the CPU demands will infringe on their own use, etc. Most users will believe there is some amount of cost.
With certain projects, volunteers may get some pleasure or entertainment value out of participating: for example, the search for large Mersenne primes is exciting to those who enjoy number theory; searching for alien intelligence probably provides a thrill to many.
I suspect a related motivation is sufficient for most volunteers: the projects generally have a socially valuable goal, so people can feel like they are helping make the world a better place, at a rather small cost to themselves. For example there are projects to screen cancer drugs, search for medications for tuberous sclerosis, and help calibrate the Large Hadron Collider (for physics research). As Muchmore writes, "a couple of the projects—Ubero and Gómez—will pay you a pittance for your processing time. But wouldn't you feel better curing cancer or AIDS?"
These projects appear to attract a lot of volunteerism. Muchmore reports estimates of participation that range from one to over five million computers at any given moment. According to the BOINC project, volunteers are generating about 400 teraflops of processing, far more than the 280 teraflops that the largest operational supercomputer can provide.
March 14, 2006
Digg, Google News...User-contributed "news"
I'm developing an interest in the phenomenon of user-contributed content, and the two fundamental incentives problems that it faces: pollution (keeping the bad stuff out) and the private provision of public goods (inducing contributions of the good stuff). User-contributed "news" is one example to explore.
Digg.com is one currently hot user-contributed news site:
Digg is a technology news website that combines social bookmarking, blogging, RSS, and non-hierarchical editorial control. With digg, users submit stories for review, but rather than allow an editor to decide which stories go on the homepage, the users do.
Slashdot of course is the grande dame. Digg and Slashdot both rely on multiple techniques of community moderation to try to maintain the quality of content (keep out the pollution). For example, proposed stories for Digg are not promoted to the homepage until they have sufficient support from multiple users; and users can report bad entries (apparently to a team of human editors).
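The two moderation techniques just described (promotion only after support from multiple users, and user reports of bad entries) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class, the thresholds and the status names are hypothetical, and Digg's actual promotion rules are not described in the source.

```python
from dataclasses import dataclass, field

PROMOTE_THRESHOLD = 3  # hypothetical: distinct supporters needed for the homepage
REPORT_LIMIT = 2       # hypothetical: distinct reports that send a story to the editors


@dataclass
class Story:
    title: str
    supporters: set = field(default_factory=set)
    reporters: set = field(default_factory=set)

    def digg(self, user):
        # Sets ignore repeat votes, so one user cannot promote a story alone.
        self.supporters.add(user)

    def report(self, user):
        self.reporters.add(user)

    def status(self):
        if len(self.reporters) >= REPORT_LIMIT:
            return "flagged"   # handed to human editors for review
        if len(self.supporters) >= PROMOTE_THRESHOLD:
            return "homepage"
        return "upcoming"


story = Story("Example story")
for user in ["alice", "alice", "bob"]:
    story.digg(user)
print(story.status())  # alice's repeat vote does not count twice
```

Counting distinct users rather than raw votes is the simplest defense against a single polluter, though it does nothing against someone who registers many accounts.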
How effective (and socially costly) are these community moderation techniques? By now we've all heard about Wikipedia founder Jimmy Wales manipulating his own Wikipedia entry, which led to publicity about multiple members of Congress, etc., who have been doing the same thing.
And even if a site has an efficient moderation system to filter out pollution, there is still the problem of inducing people to volunteer time and effort to contribute to the public good by creating valuable content. Obviously, this can happen (see Slashdot, Wikipedia). But suppose you are designing a new user-contributed content service: how are you going to create a community of users, and how are you going to induce them to donate (high quality) content?
Spamming Google News: Who's in, who's out?
An old acquaintance of mine, Rich Wiggins, recently blogged about his discovery of how easy it is to insert content in Google News. He discovered this when he noticed regular press releases published in Google News that were a front for the musings of self-proclaimed "2008 Presidential contender" Daniel Imperato. Who?
Wiggins figured out how Imperato did it, and tested the method by publishing a press release (screen shot) about his thoughts while celebrating his 50th birthday in Florida. Sure enough, you can find this item by searching on "Rich Wiggins" in Google News.
This is (for now) a fun example of one of the two fundamental incentives problems for the important and fast-growing phenomenon of user-contributed content:
- How to keep the undesirable stuff out?
- How to induce people to contribute desirable stuff?
The first we can call the pollution problem, the second the private provision of public goods problem. Though Wiggins's example is funny, will we soon find Google News polluted beyond usefulness? (The decline of Usenet was largely due to spam pollution.)
Blogs, of course, are a major example of user-contributed content. At first glance, they don't suffer as much from the first problem: readers know that blogs are personal, unvetted opinion pages, and so they don't blindly rely on what is posted as truth. (Or do they?) But then there's the problem of splogging, which isn't really a problem for blogs as much as for search engines that are being tricked into directing searchers to fake blog pages that are in fact spam advertisements (a commercial variant on the older practice of Google bombing).
There is a lengthy and informative Wikipedia article that discusses the wide variety of pollution techniques (spamming) that have been developed for many different settings (besides email and blogs, also instant messaging, cell phones, online games, wikis, etc.), with an index to a family of detailed articles on each subtype.