A look at new software that could transform journalism
In a few short years, we've learned to delegate all manner of tasks to computers. For music recommendations or driving directions or academic scouring, we readily turn to our clever machines. They do it better most of the time, and with much less effort.
Now computers have proven competence—no, fluency—in yet another aspect of human life: writing. Narrative Science, a Chicago-based startup, has developed an innovative platform that writes reported articles in eerily humanlike cadence. Their early work focused on niche markets, clients with repetitive storylines and loads of numeric data—sports stories, say, or financial reports. But the underlying logic that drives the process—scan a data set, detect significance, and tell a story based on facts—is powerful and vastly applicable. Wherever there is data, Narrative Science founders say, their software can generate a prose analysis that's robust, reliable, and readable.
For example: One high-profile client, Forbes magazine, uses the platform to create what Forbes writer Lewis Dvorkin calls "computer-generated company earnings previews." Each day, the platform sorts through recent stock data to profile a company whose performance stands out. Another client is The Big Ten Network, which uses Narrative Science to create automatic sports recaps based on box scores and player data. Though these pieces lack the verve of, say, Chuck Klosterman's sportswriting, the highly customizable platform does adopt a sports fan's idiomatic shorthand: "Cincinnati was hot from long range," one Narrative Science recap runs, "hitting 9-of-23 threes for a 39 percent night from beyond the arc." Similarly, the iPhone app Gamechanger, which coaches and parents use to score Little League games, has a "recap" service enabled by Narrative Science. Mark the final out and, kapow, you've got a print-ready article about the game. In theory, you could even receive recaps with a personal touch, nine innings retooled around the feats and foibles of your little tyke.
I traveled to Chicago to meet the Narrative Science founders and learn more about their work. They claim their technology will reshape our relationship to data, media, and the way we consume information, and after several hours of interviews, I believe them. The concern in some quarters is that Narrative Science, with its ability to generate reams of cheap, instantaneous content, is going to make human writers obsolete. The truth, however, is more complicated.
RISE OF THE META-JOURNALIST
Every startup has its rosy vision for the world. Mark Zuckerberg wants to make people more connected. Sergey Brin wants to make great content more findable. Kris Hammond, Narrative Science co-founder and CTO, wants to make things easier to read.
"Data is tremendously valuable," he told me. "It's unbelievably valuable. But it's not valuable as a spreadsheet of numbers. It's valuable based on the insights that you can glean from it." We're swimming in numeric data, he insists, almost drowning in it, which strikes him as odd because most people don't actually like numbers very much. Spreadsheets confound us because human beings think in stories. So, in Hammond's view, wherever there are numbers, we should have stories instead, and that's where Narrative Science comes in. "In the long run," he said, "our technology ends up being the mediator between data and the human experience."
When I asked him what this means for human writers, he pointed out that his work has long been a collaboration between computer scientists and journalists. In his ongoing work at Northwestern's Intelligent Information Laboratory, which he co-chairs with Narrative Science Chief Scientific Advisor Larry Birnbaum, he routinely partners with students and faculty at the university's Medill School of Journalism to create "cross-functional teams" of writers and coders. (This itself is a pioneering move, as journalists and computer scientists tend not to cross paths in scholarship or public life.) In fact, it was this dynamic interplay that led to Stats Monkey, a baseball recap platform that became the prototype for today's authoring platform.
Birnbaum and Hammond, both Yale-educated professors of computer science, have academic backgrounds in linguistic systems, and their serious interest in the science of story arc is apparent at Narrative Science. Here, because they each contribute such valuable work, writers and coders inhabit the same hierarchical plane. Programmers are crucial because they maintain and improve the robust authoring platform that is the company's foundation. This foundation is enormously powerful. "We've created a horizontal platform that's vertically agnostic, industry agnostic," CEO Stuart Frankel told me. "We can write just about any kind of content, using any kind of data." But clients do not simply have different rules (house style, publication tone, specialized vocabulary); they also tell different kinds of stories. That's why Narrative Science needs journalists.
When Narrative Science inks a deal with a new client, their writers begin work customizing the existing platform within a configuration layer. House style (how to format names and dates, when to italicize, and so on) is the easy part. What takes more time is establishing the facts and inferences that will conceivably be drawn from client data, as well as a "constellation" of possible story angles through which the data might be presented. In the case of baseball, this means "all the scenarios that might be derived from the raw data of a box score": the slugfest, the shutout, the pitchers' duel, the back-and-forth, the rainout, on and on.
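To make the idea of a "constellation" of story angles concrete, here is a minimal sketch in Python of how a box score might be mapped to an angle and then to a lead sentence. It is not Narrative Science's actual method, which the company did not describe at this level of detail. Every name here (BoxScore, pick_angle, write_lead, the templates) is hypothetical, and the run thresholds are arbitrary assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, assumed for illustration only.
SLUGFEST_MIN_TOTAL_RUNS = 15
DUEL_MAX_TOTAL_RUNS = 3
BACK_AND_FORTH_MIN_LEAD_CHANGES = 2


@dataclass
class BoxScore:
    """Runs scored per inning by each team (a simplified, assumed format)."""
    home: str
    away: str
    home_runs_by_inning: List[int]
    away_runs_by_inning: List[int]


def lead_changes(box: BoxScore) -> int:
    """Count how often the lead passes from one team to the other,
    checking the running score at the end of each inning."""
    changes = 0
    leader = 0  # 1 = home leads, -1 = away leads, 0 = nobody has led yet
    home_total = away_total = 0
    for h, a in zip(box.home_runs_by_inning, box.away_runs_by_inning):
        home_total += h
        away_total += a
        diff = home_total - away_total
        current = (diff > 0) - (diff < 0)
        if current != 0:
            if leader != 0 and current != leader:
                changes += 1
            leader = current
    return changes


def pick_angle(box: BoxScore) -> str:
    """Choose one story angle; the first matching rule wins."""
    home = sum(box.home_runs_by_inning)
    away = sum(box.away_runs_by_inning)
    total = home + away
    if total >= SLUGFEST_MIN_TOTAL_RUNS:
        return "slugfest"
    if total <= DUEL_MAX_TOTAL_RUNS:
        return "pitchers' duel"
    if min(home, away) == 0:
        return "shutout"
    if lead_changes(box) >= BACK_AND_FORTH_MIN_LEAD_CHANGES:
        return "back-and-forth"
    return "routine"


# One sentence template per angle; a real system would have many variants.
TEMPLATES = {
    "slugfest": "{winner} outslugged {loser}, {w}-{l}.",
    "pitchers' duel": "{winner} edged {loser}, {w}-{l}, in a pitchers' duel.",
    "shutout": "{winner} blanked {loser}, {w}-{l}.",
    "back-and-forth": ("{winner} outlasted {loser}, {w}-{l}, "
                       "in a game with {changes} lead changes."),
    "routine": "{winner} beat {loser}, {w}-{l}.",
}


def write_lead(box: BoxScore) -> str:
    """Fill the template for the chosen angle with facts from the box score."""
    home = sum(box.home_runs_by_inning)
    away = sum(box.away_runs_by_inning)
    if home >= away:
        winner, loser, w, l = box.home, box.away, home, away
    else:
        winner, loser, w, l = box.away, box.home, away, home
    return TEMPLATES[pick_angle(box)].format(
        winner=winner, loser=loser, w=w, l=l, changes=lead_changes(box)
    )


if __name__ == "__main__":
    game = BoxScore("Cubs", "Cardinals",
                    [1, 0, 0, 2, 0, 0, 0, 2, 0],
                    [0, 2, 0, 0, 0, 2, 0, 0, 0])
    print(write_lead(game))
```

The point of the sketch is the division of labor the founders describe: the rules and templates are written once, by a person, and the same code then produces a different sentence for every box score it is handed.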
In this way, Narrative Science writers don't think about specific stories as much as they outline a web of story possibilities. "They know how to configure our technology to allow them to become what are essentially meta-journalists," Frankel told me, "people who can write millions of stories [as] opposed to a single story at a time." As the technology progresses, we may see more and more writers working on this macro level.
WE RECOIL FROM THE DULL
But using Narrative Science to write up baseball games is a little like hammering a nail with an atom bomb. The platform's inference engine, Hammond says, is supported by "hardcore data analytics": it can handle vast, truly complex information, data sets that would boggle any human mind. In this regard, the platform may one day serve as a kind of all-star assistant for human journalists.
Imagine, for instance, the prospect of deducing how Twitter users feel about the Republican presidential candidates on a particular day. A human journalist simply couldn't do it; trying to monitor any significant sample size would be impossible. Twitter moves so fast, and at such a high volume, that it eludes us. "The problem with social media," Hammond writes on his blog, "is that there's so damned much of it."
But Narrative Science is beta-testing an initiative that can monitor all of Twitter for trends in content, using the Republican contenders (for now) as their frame. "Newt Gingrich has been consistently popular on Twitter, as he has been the top riser on the site for the last four days," the platform reported in February. "While the overall tone of the Gingrich tweets is positive, public opinion regarding the candidate and character issues is trending negatively. In particular, @MommaVickers says, 'Someone needs to put The Blood Arm's 'Suspicious Character' to a photo montage of Newt Gingrich. #pimp.'" Sure, it's a little dry. But this kind of holistic perspective, in the future, will be useful for human writers (not to mention advertisers) as we try to wrangle social media sandstorms into something we can hold.