June 14, 2011

Forget my motto, "Privacy, Identity, Responsibility". Everybody today would rather sing "Yo-ho-ho and a bottle of rum!" Whether intellectual property or confidential information, personally created data is a buried treasure which has turned us all into plundering pirates.

I need no better proof than when such otherwise sober scholars as Professor Marc Rodwin and Jane Yakowitz come out calling for a compulsory personal "data commons". This is an opportunity, however. When I pen an open letter to Eric Schmidt or make a public suggestion to Mark Zuckerberg, I do not expect to receive an answer. With well-intentioned academics, I may perhaps trigger more fruitful exchanges.

At least we share a common starting point. To quote Marc Rodwin for instance, "[Electronic Medical Records] make it feasible to collect aggregate patient data that can be used to vastly improve medical knowledge, patient safety and public health" (*).

No doubt this is a treasure whose uncovering would add to the common good. Yet this requires different self-interested parties to cooperate in good faith, a challenge equal to the one faced by Trelawney as he looked for a ship and a crew to turn his map into gold. Unlike Robert Louis Stevenson's squire, though, our academics are wise to the ways of the world.

"Currently, the law does not clearly define property interests in patient data". Quite a British understatement, isn't it? Following in Yochai Benkler's footsteps, they also know that, unlike other goods, information is not destroyed by consumption.

And so Jane Yakowitz argues that researchers may as well get the personal data they covet since other plunderers have got to it first. "Given the data mining opportunities available on identifiable information from companies like Choice Point [sic](1), it is highly unlikely that [individuals will be put at risk by anonymized data contributed to the commons]" (**). Not to mention Catalist, Facebook and countless so-called cloud companies.

Marc Rodwin further "suggest[s] that private ownership of personal data would not eliminate the risks of violation of privacy". Indeed, whatever their privacy rights, patients must share personal data with their physicians and health plans. And while such data merits extra protection, medical practice is far from perfect. Milt Freudenheim reports that "in the last two years, personal medical records of at least 7.8 million people have been improperly exposed, according to the government data" (***)(2).

From these all too real observations, Marc Rodwin and Jane Yakowitz draw a pragmatic conclusion. Public ownership of properly anonymized personal data promises to tap the treasure it represents for the public good while adding no significant risks for the individuals concerned.

Expediency, however, can hide behind pragmatism. My first objection, technical in nature, is perhaps best conveyed by means of an example.

"The March 11 earthquake and tsunami [...] left more than 24,000 people dead or missing", writes Martin Fackler (****). Not one of these victims died of irradiation. Yet should future deaths from radiation-induced cancer be dismissed as insignificant? On the contrary, Matthew L. Wald reports that "officials in Japan agonize over what constitutes a safe radiation dose for people who live near the Fukushima Daiichi nuclear reactors" (*****).

When Jane Yakowitz rails against "de-anonymization scientists" as "conceiv[ing] of privacy from an orientation that emphasizes any harm that is theoretically possible", isn't she misunderstanding the message conveyed by Latanya Sweeney, Paul Ohm and Vitaly Shmatikov? Take Paul Ohm and his "database of ruin" for instance (3). He does not predict a tsunami; he says anonymization acts as a leaky reactor. As one accumulates enough small profile exposures, one day one will suffer a fatal outcome: hostile re-identification. Rather than scaremongering, isn't this a legitimate concern?

My second objection is moral. The existence of a greater evil does not whitewash a lesser one, even when the latter happens to be the means to deliver a good outcome in the end. If not, why bother asking patients for their consent before enrolling them in medical trials?

May I contribute a third objection in the name of logic? Either re-identification is a practical possibility or it is not. If it is, it kills privacy. If it is not, it destroys responsibility. For, if a patient profile is truly anonymous, how could factual errors be detected when the original source is forever lost? Can we base future research on irremediably defective data (4)? Whether Marc Rodwin and Jane Yakowitz are right or not, their advice is wrong.

My motto may matter after all. As Charles Leadbeater declares, "the big issues will be about ownership and control" (******). If we want to profit from our collective treasure, let us combine personal greed and social solidarity as the two-stroke engine of human progress. Let us recognize eprivacy, the right of individuals to their personal data, so that its legitimate owners may freely and responsibly contribute it to the public good.

In this light, granting immunity to third-party contributors of de-identified personal data to the commons, as advocated by Jane Yakowitz, would be privateering of sorts: legitimate on paper, still piracy at heart. Forget buccaneers. But how should we answer Marc Rodwin's own objections?

"There is no need to create private property rights to encourage production of patient data because it already exists". Indeed, while daily living is a performance art, it cannot be withheld, unlike other intellectual creations. But one pays for property for other reasons. Royalties to surface dwellers cannot possibly create more oil in the ground. Yet were the US to stop paying them, they would become as cooperative as Captain Smollett's crew.

Compulsory compliance, the only alternative, is nonsense. Who would propose reducing Saudi Arabia to colonial status? Shouldn't US citizens deserve the same respect? Granted, but will not at least some of them use their new rights to "demand exorbitant prices to purchase or use data that they control?" Whom does Marc Rodwin truly fear, though? Reluctant individual citizens are powerless to derail social solidarity. Only data aggregators have enough clout to do so. Let us promptly prosecute those who dare as pirates, with no real property rights to patient data, only custodial duties.

And so, by elimination, we get to the most profound objection presented by Marc Rodwin. As Vasant Dhar and Arun Sundararajan sum it up, "asserting property rights and notional ownership over exchanged data is a theoretical ideal that has onerous transaction costs" (*******).

Marc Rodwin and Jane Yakowitz do research. Can we look together at how to lower such high transaction costs? Next week, I will bring a map.

Philippe Coueignoux

June 2011
