Can a Computer Tell Us What Makes Paris Look Like Paris?

Blue and green street signs, tall double-paned windows, balconies enclosed with iron filigree, and a distinct lamppost style: the keys to Parisian charm, as calculated by an algorithm.

Google Street View/Carl Doersch et al.

When the directors of Ratatouille set out to create the look and feel of Paris in computer-generated art, they faced the same question that faces any artist tasked with capturing a particular place: What is it -- visually -- that makes this place this place? As co-director Jan Pinkava explained in the book The Art of Ratatouille, "The basic question for us was: 'what would Paris look like as a model of Paris?', that is, what are the main things that give the city its unique look?" To find out, his team had to "run around Paris for a week like mad tourists, just looking at things, talking about them, and taking lots of pictures." The results looked very, well, Parisian, just as the artists had hoped.


What was it about Paris they had homed in on? What is it about any city that gives it its distinct look?

Computer scientists at Carnegie Mellon University and the École Normale Supérieure in Paris have built an algorithm that uses images pulled from Google's Street View to do much what Pixar's artists did: find the small details that appear frequently in Paris and -- crucially -- do not appear in other cities. In other words: You can't evoke Paris with just the Eiffel Tower and the Arc de Triomphe. You need to find the distinct visual cues that emerge block after block, street after street. In Paris, the algorithm ferreted out the city's blue and green street signs, tall double-paned windows, balconies enclosed by iron filigree, and, as Pixar captures above, a particular lamppost style. Paris's je ne sais quoi is, au contraire, quite knowable after all -- discoverable by both artist and algorithm.
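The full system in the paper mines image patches with iteratively retrained discriminative detectors, but the seed idea is simple: a patch is a good candidate "Parisian element" if, among all patches from all cities, its nearest neighbours come overwhelmingly from Paris. A toy sketch of that scoring step (the function name, the k value, and the 2-D toy features are illustrative, not the paper's actual pipeline) might look like this:

```python
import numpy as np

def discriminative_score(candidate, paris_feats, other_feats, k=5):
    """Score a candidate patch descriptor by the fraction of its k
    nearest neighbours (Euclidean distance over the pooled patch set)
    that come from Paris. A score near 1.0 means the patch is both
    common in Paris and rare elsewhere -- a geo-informative element."""
    all_feats = np.vstack([paris_feats, other_feats])
    labels = np.array([1] * len(paris_feats) + [0] * len(other_feats))
    dists = np.linalg.norm(all_feats - candidate, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest patches
    return labels[nearest].mean()            # fraction drawn from Paris

# Toy 2-D "descriptors": one tight Paris cluster, one non-Paris cluster.
paris = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
other = np.array([[5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])

print(discriminative_score(np.array([0.05, 0.05]), paris, other, k=4))  # → 1.0
print(discriminative_score(np.array([5.05, 5.05]), paris, other, k=4))  # → 0.0
```

In the real system the descriptors are high-dimensional appearance features rather than 2-D points, and the highest-scoring candidates are refined by training a detector per element; the nearest-neighbour score above only captures the initial "frequent here, absent there" filter.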

The researchers explain how it all works in the video below:

In a paper outlining the research, the team, led by Carl Doersch of Carnegie Mellon, says that of the 12 cities they ran through the algorithm (Paris, London, Prague, Barcelona, Milan, New York, Boston, Philadelphia, San Francisco, Sao Paulo, Mexico City, and Tokyo), the American cities were the toughest nuts to crack. They write, "It is also interesting to note that, on the whole, the algorithm had more trouble with American cities; it was able to discover only a few geo-informative elements, and some of them turned out to be different brands of cars, road tunnels, etc." They hypothesize that this could stem from a "relative lack of stylistic coherence and uniqueness in American cities (with its melting pot of styles and influences), as well as the supreme reign of the automobile on American streets."

To demonstrate the value of their findings, the researchers asked artists to sketch a Paris street scene based on their idea of what Paris looks like. They then shared the algorithm's results with the artists and asked them to draw the scene again. An "informal survey" found a much more Parisian flavor in the second set.


From the street, to Street View, to computer code: With a certain finessing we can take aesthetic cues that surround us and massage them into clues that an algorithm can see. We don't tend to think of windows, lampposts, or fences as data, but that's just what they are -- they just need to be in the right format for the right processor, human or machine.