Could human and machine forecasters work together to increase the intelligence agencies' foresight?
We would like to know what the future is going to be like, so we can prepare for it. I'm not talking about building a time machine to secure the winning Powerball number ahead of time, but rather creating more accurate forecasts about what is likely to happen. Supposedly, this is what pundits and analysts do. They're supposed to be good at commenting on whether Greece will leave the Eurozone by 2014 or whether North Korea will fire missiles during the year or whether Barack Obama will win reelection.
A body of research, however, conducted and synthesized by the University of Pennsylvania's Philip Tetlock finds that people, not just pundits but definitely pundits, are not very good at predicting future events. The book he wrote on the topic, Expert Political Judgment: How Good Is It? How Can We Know?, is a touchstone for all the work that people like Nate Silver and Princeton's Sam Wang did tracking the last election.
But aside from the electorate, who else might benefit from enhanced foresight? Perhaps the people tasked with gathering information about threats in the world.
You probably have never heard of IARPA, but it's the wild R&D wing of our nation's intelligence services. Much like the Defense Advanced Research Projects Agency, which looks into the future of warfare for the Department of Defense, the Intelligence Advanced Research Projects Activity looks at the future of analyzing information, spying, surveillance, and the like for the CIA, FBI, and NSA.
We wrote in depth about a project they're running to better understand metaphors (yes, metaphors), and, now, one of their projects is to apply Tetlock's insights into expert judgment. In particular, while Tetlock found that most analysts were terrible, some were better than others, particularly those he called foxes, who were more circumspect in their pronouncements and less wedded to a hard-and-fast worldview. The work suggested that it might be possible to improve people's judgments about the future.
His work matched up perfectly with a call for proposals that IARPA put out two years ago for a new program called ACE, Aggregative Contingent Estimation. They wanted researchers to "develop and test tools to provide accurate, timely, and continuous probabilistic forecasts and early warning of global events, by aggregating the judgments of many widely dispersed analysts." Well, Tetlock thought, perhaps I can apply my research to this problem.
So, after his proposal was selected, IARPA paid for him and his team to recruit 3,000 volunteers, who each agreed to participate in forecasting tournaments that asked them to make specific, testable predictions about the future and then gave them feedback. They are competing against four other teams, also funded by IARPA, to see who can forecast the best. Just within the year and a half that the study has been running, Tetlock found that people could get better, much better, at making predictions than he thought possible.
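To make "specific, testable predictions" and "aggregating the judgments of many" a little more concrete, here is a minimal sketch of how a tournament could score and pool probability forecasts. It is an illustration only: the article does not say which scoring rule or aggregation method the teams use, so the Brier score (a standard squared-error rule for probability forecasts), the simple average, the analyst names, and the numbers are all assumptions.

```python
from statistics import mean


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a forecast probability and the 0/1 outcome.

    0.0 is a perfect forecast; 1.0 is the worst possible. Using this rule
    is an assumption; the article does not name a scoring method.
    """
    return (probability - outcome) ** 2


def aggregate(probabilities: list[float]) -> float:
    """Pool individual forecasts with an unweighted average (assumed method)."""
    return mean(probabilities)


# Hypothetical forecasts for one yes/no question, e.g. "Will X happen by date Y?"
forecasts = {"analyst_a": 0.9, "analyst_b": 0.6, "analyst_c": 0.3}
outcome = 1  # the event did happen

# Feedback for each forecaster, plus the score of the pooled forecast.
individual_scores = {name: brier_score(p, outcome) for name, p in forecasts.items()}
crowd_probability = aggregate(list(forecasts.values()))
crowd_score = brier_score(crowd_probability, outcome)
average_individual_score = mean(individual_scores.values())

print(individual_scores)
print(crowd_probability, crowd_score, average_individual_score)
```

With a squared-error rule like this one, the averaged forecast never scores worse than the average of the individual scores, which is one simple reason pooling many analysts can help. It can still lose to the single best forecaster, as it does in this made-up example.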
Tetlock discussed the work in an excellent interview with Edge.org last week. Here's how he described it:
Is world politics like a poker game? This is what, in a sense, we are exploring in the IARPA forecasting tournament. You can make a good case that history is different and it poses unique challenges. This is an empirical question of whether people can learn to become better at these types of tasks. We now have a significant amount of evidence on this, and the evidence is that people can learn to become better. It's a slow process. It requires a lot of hard work, but some of our forecasters have really risen to the challenge in a remarkable way and are generating forecasts that are far more accurate than I would have ever supposed possible from past research in this area.
But here's the really fascinating thing from a technological perspective. Tetlock's work has shown in the past that computers tend to be better forecasters than humans, with some key exceptions. But what about the combination of humans and machines? What if cyborg forecasting is the way to go? That's precisely what Tetlock suggests in the Edge interview, although he hedges it carefully (like the fox that he is).
We don't have geopolitical algorithms that we're comparing our forecasters to, but we're turning our forecasters into algorithms and those algorithms are outperforming the individual forecasters by substantial margins. There's another thing you can do though and it's more the wave of the future. And that is, you can go beyond human versus machine or human versus algorithm comparison or Kasparov versus Deep Blue (the famous chess competition) and ask, how well could Kasparov play chess if Deep Blue were advising him? What would the quality of chess be there? Would Kasparov and Deep Blue have a FIDE chess rating of 3,500 as opposed to Kasparov's rating of, say, 2,800 and the machine's rating of, say, 2,900? That is a new and interesting frontier for work and it's one we're experimenting with.
In our tournament, we've skimmed off the very best forecasters in the first year, the top two percent. We call them "super forecasters." They're working together in five teams of 12 each and they're doing very impressive work. We're experimentally manipulating their access to the algorithms as well. They get to see what the algorithms look like, as well as their own predictions. The question is: do they do better when they know what the algorithms are or do they do worse?
There are different schools of thought in psychology about this and I have some very respected colleagues who disagree with me on it. My initial hunch was that they might be able to do better.
It seems that IARPA might be happy with this work, but it remains to be seen whether it will actually get applied within the intelligence agencies. (Anyone seen Homeland? Saul is a total fox who threatens the hierarchy.)
Tetlock is convinced that his work has the potential to be dangerously destabilizing to bureaucracies of all types. Excitingly (at least to some of us), he sees these kinds of tools spreading throughout different kinds of organizations, which could have wide-ranging impacts on the existing hierarchies. From the nation's far-out spy researcher to your local middle manager, get ready for forecasting tournaments.
The long and the short of the story is that it's very hard for professionals and executives to maintain their status if they can't maintain a certain mystique about their judgment. If they lose that mystique about their judgment, that's profoundly threatening. My inner sociologist says to me that when a good idea comes up against entrenched interests, the good idea typically fails. But this is going to be a hard thing to suppress. Level playing field forecasting tournaments are going to spread. They're going to proliferate. They're fun. They're informative. They're useful in both the private and public sector. There's going to be a movement in that direction. How it all sorts out is interesting. To what extent is it going to destabilize the existing pundit hierarchy? To what extent is it going to destabilize who the big shots are within organizations?