May 31, 2011

"Paris rolled out the red carpet for the titans of Silicon Valley", writes Tim Bradshaw (*). All the victims of their past indiscretions conveniently silent, these fortunate fellows were officially feted by the French President together with other luminaries of the Internet. They did talk of data rights but far from debating about personal privacy for you and me, their focus was copyrights, the imperiled privilege of the few.

Give Tim Bradshaw his due: he pointedly concludes with Lawrence Lessig's indictment. "The future of the internet [was] not here, it was not invited". "Are you seconding this, you ask, because you did not make President Sarkozy's cut?" Touché but cheap. There is more at stake than my vanity.

If one limited oneself to copyright management, one would still be well advised to mull the meaning of John Gapper's disillusioned column about the same event (**). Stressing "the truth, inevitably, lies in the middle", he finds both sides "absurdly Manichean". Is he planning to take Philip Stephens' beat and start reporting on the prospects for peace in Palestine? There are certainly idealists on both sides, he notes, "but not everyone is as pure".

The trouble is, even purity itself is so easy a virtue to preempt and pervert.

Take Brooklyn Law School professor Jane Yakowitz and her concept of a personal "data commons". Unable today to access her forthcoming paper on "Privacy and the Public Interest", I rely on Leon Neyfakh's essay (***) to find "she proposes granting legal immunity to any entity that releases data into the commons [...] under the condition that they follow a set of strictly enforced standards for anonymization".

Jane Yakowitz lauds the merit of armor while Latanya Sweeney, Paul Ohm and Vitaly Shmatikov proclaim the power of the guns. When I root for the latter, when I push for privacy beyond mere anonymization, I acknowledge this furthers my interests. Mirroring me, Jane Yakowitz recognizes her research suffered when "individual graduates [...] objected to have their [personal] information included in her analysis". This is honest purity.

But purity ought to be consistent too. If public data commons are good, why doesn't Jane Yakowitz share her draft with us today? Is it because intellectual creation is the only personal production which deserves the privilege of privacy, far above the humble performance art of daily living?

"Privacy advocates are so locked into their own ideological viewpoint", says Heinz College professor emeritus George T. Duncan, "they fail to appreciate the value of the data". I beg to differ. "Eprivacy is all about money", I wrote three years ago. What I see however are scholars as keen as any to protect their own research data from colleagues competing for recognition and to charge for teaching, contrary to early medieval ideals.

The truth is that personal data rights and copyrights are two sides of the same fight. The only difference is that Google and Facebook pirate personal data with impunity while President Sarkozy punishes those who share popular songs. Police both sides under similarly fair laws or not at all.

Purity also invites its devotees to make values sacred. Beware that, in human hands, these tend to turn into dangerous idols. Make freedom of speech an absolute good and you may end up under a tyranny. The same warning holds for private property, including copyrights and personal data rights. Equally Manichean is the opposite push to sanctify sharing beyond its own merits. Like truth, virtue often resides in the middle.

Finally, being whiter than white may close the mind to arguments which threaten one's opinions. Jane Yakowitz seems to have joined the cult of Big Data, as recently proclaimed by McKinsey. But read Eli Pariser (****) on the consequences of what amounts to absolute centralization.

In regard to intellectual property, John Kay calls for "a single digital archive with easy universal access" as "a publicly sponsored project" (*****). As he wisely lets others "set the terms and conditions" for using it, his proposal is uncontroversial. It also highlights the link between storage and access. For Eli Pariser, access is where the rub is. "Increasingly, and nearly invisibly, our searches for information are being personalized too".

Creepy, for "[the gatekeepers are not] people, they're code". Lawrence Lessig and Jonathan Zittrain have already warned us. Code is different, at once more efficient and more ruthless. For Douglas Rushkoff, the power flows to those who control the code which channels our social interactions. Worse, code is both inevitably fallible and a ready excuse for its masters to shrug off all responsibility. Let machines and lowly programmers take the blame!

Most damning of all, the very reason for Big Data, i.e. that centralization is the only way to enable the emergence of information, remains to be proven. Recall that Big Data has always been with us. What else is Nature? Yet its very bigness is an obstacle to scientific progress. Singular facts observed with open curiosity may sprout into new theories, as when Becquerel stumbled onto radioactivity (1). More often large amounts of data are systematically processed to test scientific hypotheses. But is there one example of unguided, exhaustive accounting of raw data leading to a real discovery?

Similarly, research in human sciences would not be impaired, on the contrary, if it were to be carried out on real volunteers, paid or not according to the case. Would it be impossible to find a few to agree to share their personal data in great depth? On a larger scale and for the common good, who would refuse to download a well formed question and let it execute against one's profile in one's very own confidential environment, as long as the researcher could only tally the answers? Instead of forced sharing, genuine participation frees privacy from frivolous fishing expeditions.
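
To make the idea concrete, here is a minimal sketch in Python of a question that travels to the data rather than the reverse. It is only an illustration under stated assumptions: the names Question, answer_locally and tally, as well as the sample profiles, are hypothetical and describe no existing system. Each profile stays with its owner, the question runs locally, and the researcher receives nothing but counts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# Hypothetical stand-in for a personal profile kept on its owner's machine.
Profile = Dict[str, object]


@dataclass(frozen=True)
class Question:
    """A 'well formed question': a label plus a yes/no test over one profile."""
    label: str
    predicate: Callable[[Profile], bool]


def answer_locally(question: Question, profile: Profile) -> bool:
    """Meant to run in the participant's own environment.

    Only the resulting boolean leaves; the profile itself is never returned.
    """
    return bool(question.predicate(profile))


def tally(answers: Iterable[bool]) -> Counter:
    """All the researcher sees: how many answered yes and how many no."""
    return Counter("yes" if a else "no" for a in answers)


# Made-up profiles, each imagined to live on a different volunteer's device.
profiles = [
    {"age": 34, "commutes_by_bike": True},
    {"age": 51, "commutes_by_bike": False},
    {"age": 27, "commutes_by_bike": True},
]

question = Question(
    label="bike commuter under 40",
    predicate=lambda p: p.get("commutes_by_bike") is True and p.get("age", 0) < 40,
)

result = tally(answer_locally(question, p) for p in profiles)
print(result["yes"], result["no"])  # prints: 2 1
```

A real deployment would have to do much more, for instance reject questions narrow enough to single out one person and withhold tallies that are too small. The sketch ignores these safeguards and only shows the direction of the flow.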

The first American soldiers to die in battle did so in the name of "no taxation without representation". Today's populists would rather, or so it seems, have no taxation at all, while Big Data partisans want no representation at all, lest, too dumb for our own good, we refuse to freely consent to the appropriation of our own data without just compensation. Contrast this with the Creative Commons license. When I volunteer property, I mean it.

If the future of the Internet had been invited to Paris then, the guests would have talked of how to avoid centralizing data and its access. They would have looked at how to deploy mechanisms which can effectively deal with decentralized data. They would have worked out the tools with which to enable a healthy diversity of human recommenders to help us access the data we create collectively, one individual creator at a time.

Outstanding models are not wanting. What about the Internet itself, and, as Berners-Lee would point out, the World Wide Web? For personal data, I contend that ePrio is of the same kind. Much remains to be done. But perhaps the future of the Internet (2) is a thing of the past.

Were the Internet to be financed all over again, would today's data titans share their power and profit besides our personal profiles? Pure speculation.

Philippe Coueignoux

May 2011
