Box-Office Algorithm Predicts Revenue From Movie Screenplay

By Alexis C. Madrigal

It seems like every summer, I go to see at least one horrible movie, which forces me to question the sanity of Hollywood. "How could they have thought that was going to do well?" Well, some business scholars want to take the guesswork out of the movie business. They're working on an algorithm that can (prest-o, change-o!) distill a screenplay's text into a box office tally.

Imagine a world where Hollywood producers could predict, with scientific precision, the box office revenue a movie will generate just by reading the screenplay. A new forecasting model devised by a trio of marketing professors from Wharton and NYU promises to deliver something like that. Among their findings: action movies with multidimensional conflicts are the most surefire investments, and horror films the riskiest.

Read the full story at Freakonomics at The New York Times.

The full paper [pdf] is fascinating, mostly for the factoids. For example, the authors use a common natural language processing method called "bag-of-words," which treats a text as an unordered tally of how often each word appears. Here are the 30 most common words (all forms included) in their dataset of movie scripts. The f-word, man, dad, mom: those I can understand. But how about "corridor"? Then you start thinking of all the movies in which someone walks/runs/fights down a corridor. (So many!) Note "chamber" and "tunnel" as well. It's like this study discovered a hidden truth about the way Hollywood architecture has to work.

[Image: 30word.jpg, the 30 most common words in the movie-script dataset]
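
For the curious, here is a rough sketch of what a bag-of-words tally looks like in Python. It is not the authors' pipeline: the function names, the crude regex tokenizer, and the two toy "scripts" are all assumptions made up for illustration, and unlike the paper it does not fold different forms of a word together.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase word appears, ignoring word order."""
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def most_common_words(scripts, n=30):
    """Merge the per-script tallies and return the n most frequent words."""
    total = Counter()
    for script in scripts:
        total.update(bag_of_words(script))
    return total.most_common(n)


# Toy stand-ins for screenplays (hypothetical text, not from the dataset).
toy_scripts = [
    "He runs down the corridor. The corridor is dark.",
    "Dad waits in the tunnel. Mom runs to the corridor.",
]

print(most_common_words(toy_scripts, n=3))
```

A real analysis would also drop filler words like "the" and group word forms together before ranking, which is presumably how a word like "corridor" ends up near the top of a list like this.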

This article is available online at:

http://www.theatlantic.com/technology/archive/2010/08/box-office-algorithm-predicts-revenue-from-movie-screenplay/61844/