Philippe's Fillips

December 2, 2008

Like the best gardens, some topics never lose their capacity to reveal new perspectives to the discriminating eye. Last week led me closer to the truth. Today I find myself compelled to revisit one of my favorite subjects, recommendations (1).

This new enquiry was prompted by an elegiac column written by Luke Johnson on the passing of "traditional small ads" and the newspapers they supported, done in by "new digital facilities" (*). While it shows even Anglo-Saxon capitalists (2) can have feelings, the point is whether, as asserted, "the random serendipity of old-fashioned print classifieds is rapidly disappearing" and, as deplored, "we are [thus] losing something special".

Take Luke Johnson's first example. In want of disk jockey experience, he finds a help wanted ad for such a position. If you know what you want, I fail to see the charm of serendipity. Modern search engines will indeed get it for you faster than any half-hearted perusal of inches in print.

His second example is no more convincing. For sale, a company looks for an entrepreneur with access to private equity. Call this reverse search advertising if you like; the lesson stays the same. Targeted advertising on appropriate social networks should work better than old-fashioned media.

The third example, however, opens an attractive vista. Suppose you want to unload "a derelict chapel" in rural Somerset. Listing it on Craigslist may not be such a great idea, as prospects there are unlikely to type "chapel" as a keyword. Yet what online behavior would give your advertising dollar more of a chance than the UK National Lottery? Repeat searches for Hubert Robert (3)? Perhaps, but I would not bet on it.

If serendipity is the answer, forget about two strangers bumping into each other at random. Some intermediation is required. What classified ads have traditionally enabled is still at heart what I call a recommendation mechanism. So far I have looked at how such schemes fulfill their social obligations, stressing the responsibility they ought to bear despite a general reluctance to do so. It is now time to question their efficiency.

Obligingly, Clive Thompson provides us with an in-depth review of the Netflix contest (**). Now entering its third year, this challenge has the great advantage of exposing the mechanics behind the mechanism. In this instance, "most of the leading teams [...] use singular value decomposition" (4). In other words, while consumers' movie preferences seem to be all over the map, one can describe both consumers and movies according to a finite set of factors linked by simple relations. Unidimensional stereotypes such as "men prefer action movies and women romantic comedies" give the idea, however crude.
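
For the curious, here is a minimal sketch of the idea, not the method of any Netflix contestant. It assumes a made-up table of four viewers and four movies, with ratings expressed as deviations from the average, and extracts the single strongest factor of a singular value decomposition by power iteration. The function name leading_factor, the variable names and all the figures are hypothetical.

```python
import math

def leading_factor(matrix, iterations=100):
    """Return (sigma, u, v): the largest singular value of the matrix
    and its left/right singular vectors, found by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] + [0.0] * (cols - 1)  # arbitrary starting direction
    for _ in range(iterations):
        # u = A v, then w = A^T u; normalizing w gives the next v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v

# Hypothetical data: rows are viewers, columns are movies
# (two action movies, then two romantic comedies).
# Each entry is a rating minus the overall average.
ratings = [
    [ 2,  1, -2, -1],
    [ 1,  2, -1, -2],
    [-2, -1,  2,  1],
    [-1, -2,  1,  2],
]

sigma, viewer_factor, movie_factor = leading_factor(ratings)

def predict(viewer, movie):
    """Rank-one estimate of a rating from the single leading factor."""
    return sigma * viewer_factor[viewer] * movie_factor[movie]

print("strength of the factor:", round(sigma, 3))
print("movie scores on the factor:", [round(x, 3) for x in movie_factor])
print("viewer scores on the factor:", [round(x, 3) for x in viewer_factor])
```

On this toy table the single factor separates the two action movies from the two comedies, and the first two viewers from the last two: the crude stereotype, recovered from numbers alone. Real systems keep many such factors, which is where the explanations become harder to read.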

Unfortunately "while the teams are producing ever-more-accurate recommendations, they cannot precisely explain how they're doing this". That is troubling. The power of stereotypes lies precisely in their obviousness and at least their victims can understand what's happening and protest. On what basis can one appeal a black box decision?

If some mechanisms seem to know unknowable things about consumers, others know too many facts. Brad Stone details how data brokers do not hesitate to help banks target troubled consumers (***). "Data Warehouse charges banks $499 for 2,500 names of subprime borrowers who have fallen into debt and need to refinance", i.e. twenty cents per recommendation to market "needy, overwhelmed consumers [...] offers [...] too good to be true". InfoUSA was reported to do the same for elderly people. The slightly slimy business model of such outstanding corporate citizens, who donate millions to political campaigns, predates the Internet. Do not be fooled by Luke Johnson's nostalgia for some pre-Internet Golden Age.

And yet most mechanisms seem to fall short of knowing details which can make or break a recommendation. In Michael Schrage's words (****), "these algorithms [...] are also subject to sudden bouts of apparent blindness", confusing for instance your own purchases with the gifts you buy for your friends. Good recommendations need context, none more so than a personalized advertisement which, despite being focused on the user's profile, cannot forget about "content adjacency" lest it pitch beef steaks next to a report on how a bad diet can cause cardiovascular disease.

For all these reasons, it is not an easy task to build a recommendation mechanism which proves to be both ethical and efficient. But would serendipity be more efficient, even if it successfully evaded all responsibilities under cover of randomness?

Writing on online documentary research by academics (*****), Rebecca Tuhus-Dubrow gives a much more positive argument in favor of randomness. In a paper published "in July in the journal Science", James Evans warns us that "the Internet's influence is to tighten consensus", with a "narrowing" effect on scholarship. Last year we followed John Kay's lead in denouncing "scientific consensus" as a devilish chimera. Yet most recommendation mechanisms are based on popularity, explicit or not, and academic referencing is no exception. Giving preference to the most recent research is no remedy as it merely converts consensus into fashion.

When it comes to the apparel and entertainment industries, some may object that fashion, the herd instinct recommendation mechanisms so encourage, is not necessarily a bad thing. More puzzling is their opposite failure, unpredictability. As Clive Thompson tells us apropos Netflix, "a small group of mainly independent movies represents more than half of the [...] errors". Better a video-store clerk's serendipitous advice. A computer cannot begin to guess who would like such a movie, any more than it can target who would love to live in a derelict rural chapel.

Have we then come full circle? Are Luke Johnson's regrets justified? Must serendipity stand for the failure of all online recommendations?

Michael Schrage helps us see our way through this labyrinth by opposing comparison to browsing. Existing recommendation mechanisms, which work by ranking all possible choices, are but comparison tools for users with a goal in mind. Needed to implement browsing online is a different species of recommendation mechanism. Browsers and hyperlinks are not enough. As Rebecca Tuhus-Dubrow quotes Robert Berring, "if you get an index, a table of contents, you see the environment". In other words, browsing is about creating a complex context to satisfy unfocused curiosity.

I would not take classified ads as a model but I agree with Luke Johnson on this. Today's Internet sorely lacks the quality which sustains browsing, tasteful variety. Variety we do have aplenty, but we are lost in a jungle we can only escape by submitting to the dreadful conformity of "the top ten". The main page of Wikipedia is a good example (5). If we want to browse rather than search, proposing "A random article" is to debase serendipity. Other rubrics, such as "Did you know..." and "On this day...", are better, but can the context they offer not rise above a collection of anecdotes? (6)

To complement search with true serendipity, combine, I recommend, the trained talent of the librarian with that of a landscape architect (7).

Philippe Coueignoux

(*) ......... Wanted: a return to the thrill of small ads, by Luke Johnson (Financial Times) - November 19, 2008

(**) ....... If You Liked This, You're Sure to Love That, by Clive Thompson (New York Times Magazine) - November 23, 2008

(***) ..... Drawing a Bead On Debtors, by Brad Stone (New York Times) - October 22, 2008

(****) ... Recommendation Nation, by Michael Schrage (Technology Review) - May/June 2008

(*****) . Group Think, by Rebecca Tuhus-Dubrow (Boston Globe) - November 23, 2008

(1) see "recommendation mechanisms" in the list of Major Themes of these fillips.

(2) see Anglo-Saxon capitalism in the Wikipedia

(3) see Hubert Robert in the Wikipedia

(4) see singular value decomposition in the Wikipedia

(5) see the main page in the English Wikipedia

(6) added 06/26/09: to do justice to the Wikipedia, the abundance of inner links within its pages strongly supports browsing, especially when the page is designed as an overall introduction to a body of knowledge (e.g. the Solar System)

(7) this happens to be the profile of Rafael Tarrago, librarian at the University of Minnesota Wilson Library

December 2008

Copyright © 2008 ePrio Inc. All rights reserved.