June 06, 2013
Everything can -- and will -- be manipulated
Well, not "everything". But every measure on which decisions of value depend (e.g., book purchases, dating opportunities, or tenure) can and will be manipulated.
And if the measure depends on user-contributed content distributed on an open platform, the manipulation often will be easy and low cost, and thus we should expect to see it happen a lot. This is a big problem for "big data" applications.
This point has been the theme of many posts I've made here. Today, a new example: citations of scholarly work. One of the standard, often highly valued (as in, makes a real difference to tenure decisions, salary increases and outside job offers) measures of the impact of a scholar's work is how often it is cited in the published work of other scholars. ISI Thomson has been providing citation indices for many years. ISI is not so easy to manipulate because -- though it depends on user-contributed content (articles by one scholar that cite the work of another) -- that content is distributed on closed platforms (ISI only indexes citations from a set of published journals that have editorial boards which protect their reputation and brand by screening what they publish).
But over the past several years, scholars have increasingly relied on Google Scholar (and sometimes Microsoft Academic) to count citations. Google Scholar indexes citations from pretty much anything that appears to be a scholarly article and is reachable by the Google spiders crawling the open web. So, for example, it includes citations in self-published articles, or e-prints of articles published elsewhere. Thus, Google Scholar citation counts depend on user-contributed content distributed on an open platform (the open web).
And, lo and behold, it's relatively easy to manipulate such citation counts, as demonstrated by a recent scholarly paper that did so: Delgado Lopez-Cozar, Emilio; Robinson-Garcia, Nicolas; Torres Salinas, Daniel (2012). Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting. EC3 Working Papers 6: 29 May, 2012, available as http://arxiv.org/abs/1212.0638v2.
Their method was simple: they created some fake papers that cited other papers, and published the fake papers on the Web. Google's spider dutifully found them and increased the citation counts for the real papers that these fake papers "cited".
The lesson is simple: for every measure that depends on user-contributed content on an open platform, if valuable decisions depend on it, we should assume that it is vulnerable to manipulation. This is a sad and ugly fact about a lot of new opportunities for measurement ("big data"), and one that we must start to address. The economics are unavoidable: the cost of manipulation is low, so if there is much value to doing so, it will be manipulated. We have to think about ways to increase the cost of manipulating, if we don't want to lose the value of the data.
July 20, 2009
Fines: cosmetic incentives?
The State of New York announced settlement of a lawsuit it filed against Lifestyle Lift for "astroturfing" (paying its employees to "flood" the net with false positive reviews). The company will pay a $300,000 fine, plus an undisclosed amount of New York's legal costs.
Lifestyle Lift is a facial cosmetic surgery procedure that purports to be quicker and safer than traditional facelift procedures, with shorter recovery time and lower cost.
According to the NY State Attorney General's office, employees published anonymous reviews to the web to trick potential customers. They did this on legitimate review sites, and they also created standalone web sites that purported to be independent, where they created all of the "reviews" or edited reviews by third parties to skew the discussion.
See also this New York Times story.
Laws that impose possible fines or other punishments (such as jail time) are an incentives-based approach to shaping behavior. A simplified version of the idea is that if the cost of the punishment, multiplied by the likelihood that the agent will be caught and punished (and discounted to present value), is greater than the expected benefit from the improper behavior, it will not be in the agent's self-interest to engage in the behavior.
One concern about using legal punishment incentives is that they involve multiple sources of uncertainty (about punishment size and likelihood of being caught and punished), and that seemingly large ex post punishments may not be that much of an ex ante deterrent.
Lifestyle Lift was fined $300,000 plus legal costs. Suppose that it had known with certainty that it would have to pay this fine several years after earning money as a result of publishing false reviews. Would it have chosen to be honest? That depends, of course, on how many consumers it falsely induced to get the procedure, and the profit on the procedure. According to current customer comments on one review site that claims to have been abused (RealSelf.com), the procedure costs on the order of $5000, only some portion of which will be profit. Suppose that the profit rate is 10% (about $500 per procedure): then of the "nearly 100,000" customers it claims to have served, Lifestyle Lift would have had to falsely induce at least 600 just to cover the fine ($300,000 / $500 = 600). If many more than 600 had been tricked, then even knowing the fine would occur may not have been sufficient deterrent. Factor in the uncertainty about being caught and fined at all, and the number of customers they had to successfully trick might have been even lower (there were also uncertainties about the benefits of lying that would have to be taken into account).
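To make the back-of-the-envelope arithmetic concrete, here is a minimal sketch of the break-even calculation. The function name and the 50% detection probability are my own illustrative assumptions, not figures from the settlement; the sketch also ignores discounting, the undisclosed legal costs, and any reputational harm.

```python
def break_even_customers(fine, profit_per_customer, prob_punished=1.0):
    """Number of falsely induced customers at which the extra profit
    just equals the expected fine (hypothetical helper, no discounting)."""
    if profit_per_customer <= 0:
        raise ValueError("profit_per_customer must be positive")
    if not 0.0 <= prob_punished <= 1.0:
        raise ValueError("prob_punished must be between 0 and 1")
    expected_fine = prob_punished * fine
    return expected_fine / profit_per_customer


FINE = 300_000              # fine from the settlement, in dollars
PROFIT_PER_CUSTOMER = 500   # assumed: 10% of a roughly $5000 procedure

# Fine known with certainty, as in the thought experiment above.
certain = break_even_customers(FINE, PROFIT_PER_CUSTOMER)

# Assumed for illustration only: a 50% chance of being caught and fined.
uncertain = break_even_customers(FINE, PROFIT_PER_CUSTOMER, prob_punished=0.5)

print(f"Break-even with a certain fine: {certain:.0f} customers")
print(f"Break-even with a 50% chance of the fine: {uncertain:.0f} customers")
```

The lower the chance of being punished, the fewer customers the false reviews need to bring in before lying pays for itself, which is why a fine that looks large after the fact can be a weak deterrent beforehand.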
There is at least one reason the incentive might be greater: harm to Lifestyle Lift's reputation. For example, this settlement was reported in the New York Times, and the story is starting to circulate through blogs and other information sources.
On RealSelf.com, where presumably the false reviews have now been removed, 65% say that the procedure is not worth it. Meanwhile, Lifestyle Lift now posts a badge and promised "Internet Code of Conduct" on its own web site, stating that it "is proud to take a leadership role in establishing new standards of Internet conduct and communications." I don't know when that "code" first appeared, but it seems likely that this is an example of trying to turn lemons into lemonade.
July 17, 2006
ICD to establish trust for transactions
Huberman, Wu and Zhang just published an article in Netnomics called "Ensuring trust in one-time exchanges: the QoS problem". (Folks at UM can access through the library's "Electronic Journals" access page.) They are interested specifically in the problem of purchasing from an IT service provider who does not have a verifiable reputation; this is a generic problem in transactions, but their modeling makes specific assumptions for this particular situation.
Summary from the paper:
In our model, a quality of service contract describes the likelihood that the service provider delivers the promised service. We have designed a mechanism that forces the provider to reveal his true assessment of the probability that he will be delivering a given service in a single interaction with a user/customer. We also solved the complementary truthtelling reservation problem of obtaining from the user his assessment of the true probability that a given level of resources will be required at the time of their delivery. In both cases, our mechanisms use a contingent contract to elicit true revelation of both QoS and likelihood of use through a pricing structure that forces the parties to make accurate assessments of their ability to do what they commit to.
They also apply the problem to situations in which service providers might overbook.