Imagine a world where Hollywood producers could predict, with scientific precision, the box office revenue a movie will generate just by reading the screenplay. A new forecasting model devised by a trio of marketing professors from Wharton and NYU promises to deliver something like that. Among their findings: action movies with multidimensional conflicts are the most surefire investments, and horror films the riskiest.
Read the full story at Freakonomics at The New York Times.
The full paper (PDF) is fascinating, mostly for the factoids. For example, the authors use a common natural language processing method called "bag-of-words," which treats a text as a tally of how often each word appears, ignoring word order. Consider the 30 most common words (all forms included) in their dataset of movie scripts. The f-word, man, dad, mom: those I can understand. But how about "corridor"? Then start thinking of all the movies in which someone walks, runs, or fights down a corridor. (So many!) Note "chamber" and "tunnel" as well. It's as if this study discovered a hidden truth about the way Hollywood architecture has to work.
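For the curious, here is roughly what a bag-of-words tally looks like in practice. This is a minimal sketch, not the professors' actual pipeline: the function names `bag_of_words` and `top_words` and the two sample "scripts" are invented for illustration, and unlike the paper's counts it does not merge different forms of a word.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each word appears in text, ignoring word order."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def top_words(scripts, n=30):
    """Return the n most common words across a collection of scripts."""
    total = Counter()
    for script in scripts:
        total.update(bag_of_words(script))
    return total.most_common(n)


# Hypothetical stand-ins for real screenplays.
scripts = [
    "He runs down the corridor. The corridor is dark.",
    "Mom waits in the tunnel while dad checks the corridor.",
]

print(top_words(scripts, n=2))
```

A real analysis would likely also drop filler words such as "the" and fold plurals and verb tenses together before counting, which is presumably what "all forms included" refers to.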

This article available online at:
http://www.theatlantic.com/technology/archive/2010/08/box-office-algorithm-predicts-revenue-from-movie-screenplay/61844/