The consequences of failing to measure the impact of so many of our government programs—and of sometimes ignoring the data even when we do measure them—go well beyond wasting scarce tax dollars. Every time a young person participates in a program that doesn’t work but could have participated in one that does, that represents a human cost. And failing to do any good is by no means the worst sin possible: some state and federal programs actually harm the people who participate in them.
You’ve surely heard of Scared Straight, a program started in a New Jersey prison in the 1970s that brings at-risk youth to meet with hardened inmates who tell them about the harsh realities of life behind bars. The program has gotten an extra dose of attention lately because of the A&E reality TV show Beyond Scared Straight, which takes viewers inside similar (and generally more harrowing) prison programs for young people in different states, blue and red, across the country.
It turns out that Scared Straight–style programs are actually pretty effective—at increasing criminal behavior. Rigorous research conducted by Anthony Petrosino and researchers at the Campbell Collaboration shows that instead of scaring kids and turning them away from risky, criminal behavior, the programs do just the opposite: they make the kids about 12 percent more likely to commit a crime.
Fortunately, the Department of Justice is acting on these findings and warning state governments to stop funding Scared Straight and similar programs. But Scared Straight is not the only government program that’s been shown to cause harm. The federal government’s long-running after-school program, 21st Century Community Learning Centers, has shown no effect on academic outcomes for elementary-school students, along with significant increases in school suspensions and incidents requiring other forms of discipline. The Bush administration attempted to reduce funding for the program. But following impassioned testimony on behalf of the program by Arnold Schwarzenegger, then a potential candidate for governor of California, congressional appropriators agreed to restore all funding. Today the program still gets more than $1 billion a year in federal funds.
What can we do to promote moneyball in government? The first (and easiest) step is simply collecting more information on what works and what doesn’t.
The Obama administration has already pushed federal agencies to bolster their analytic capabilities and to show how their funding priorities are evidence-based, particularly in their budget submissions. As a result, the administration’s 2014 budget proposal had an unprecedented focus on evidence and results.
A nonprofit organization that advocates for evidence-based decision making, called Results for America, has proposed a number of measures that would expand on these efforts. It is calling for reserving 1 percent of program spending for evaluation: for every $99 we spend on a program to improve education, reduce crime, or bolster health, we would spend $1 making sure the program actually works.
The Harvard economist Jeffrey Liebman has written that, based on his simple but convincing calculations, “spending a few hundred million dollars more a year on evaluations could save tens of billions of dollars by teaching us which programs work and generating lessons to improve programs that don’t.” Who wouldn’t want a 100-fold return on investment?
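The arithmetic behind the 1 percent set-aside and the 100-fold return can be checked with a back-of-the-envelope sketch. The function names below are hypothetical, and the two dollar figures are assumptions chosen only to stand in for "a few hundred million" and "tens of billions"; they are not Liebman's actual numbers.

```python
from fractions import Fraction


def evaluation_set_aside(total_dollars, share=Fraction(1, 100)):
    """Split a budget into (program, evaluation) dollars for a given set-aside share."""
    evaluation = total_dollars * share
    return total_dollars - evaluation, evaluation


def return_multiple(savings_dollars, evaluation_dollars):
    """Dollars saved per dollar spent on evaluation."""
    return Fraction(savings_dollars) / Fraction(evaluation_dollars)


# The 1 percent proposal: out of every $100, $99 funds the program and $1 funds evaluation.
program, evaluation = evaluation_set_aside(100)
print(program, evaluation)  # 99 1

# Assumed, illustrative figures (not taken from Liebman's calculations).
ASSUMED_EVALUATION_COST = 300_000_000   # "a few hundred million dollars"
ASSUMED_SAVINGS = 30_000_000_000        # "tens of billions of dollars"
print(return_multiple(ASSUMED_SAVINGS, ASSUMED_EVALUATION_COST))  # 100
```

Under those assumed figures the ratio comes out to 100; smaller savings or higher evaluation costs would lower the multiple proportionally.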
The more evidence we have, and the stronger and more systematically presented it is, the harder it will be for lawmakers to ignore. Still, linking evaluation to program funding will be tough, as both of us have seen in practice, again and again.
One thing that is essential to a more results-driven government is holding politicians accountable for their support of failing programs. Interest groups regularly rate politicians on their adherence to a particular perspective. What if we had a Moneyball Index, easily accessible to voters and the media, that rated each member of Congress on their votes to fund programs that have been shown not to work?
Even absent such public shaming, the government is taking steps in the right direction. The Department of Education’s Investing in Innovation (i3) program for improving student achievement and educator effectiveness, for instance, gives priority to projects backed by rigorous evidence of success, while still allocating a portion of its funds for promising programs willing to build evidence over time. The program originated in the rush and jumble of the Recovery Act, so it bypassed some typical congressional hurdles. But the performance mandate now built into i3’s design provides a model for how the federal government can make decisions about programs based on impact. Liebman has put forward some good ideas about how to expand upon that model. He suggests that, to start, 5 percent of the dedicated funding that’s delivered each year by the federal government to state and local governments—which includes major programs like the Community Development Block Grant and the Community Mental Health Services Block Grant—be reserved for programs that have demonstrated their worth. That share could rise over time as the evidence base expands.
New York City Mayor Michael Bloomberg is taking another promising approach, essentially creating probationary programs that must prove themselves to become permanent. The city’s Center for Economic Opportunity seeks out new, innovative programs with potential to combat the poverty cycle, and then oversees rigorous evaluations “to determine their effectiveness in reducing poverty, encouraging savings, and empowering low-income workers to advance in their careers.” The programs that produce the strongest results become eligible for further city funding; if a program isn’t having the intended effect, its dollars are shifted to programs that do work. This approach has now spread to seven other urban areas, with the help of the Obama administration’s performance-based Social Innovation Fund.
How can we steer dollars away from well-established programs that aren’t working? The U.S. Department of Health and Human Services has shown one nuanced approach. In 2011, the Obama administration built on the Bush administration’s attempt to examine how children were faring in individual Head Start programs across the country, and began a crackdown on providers failing the kids they were supposed to serve. Instead of threatening to scrap Head Start altogether, the agency refused to renew funding for the bottom 10 percent of local programs—132 in all. These centers were told that they had failed to meet quality standards. To requalify for Head Start funding, they would have to make substantive improvements and then compete for funds against other providers in their area. Autopilot funding for lousy centers came to an end.
Another encouraging data point: two years ago, when fiscal pressures really began to mount at the federal level, Congress finally pulled the funding for the ineffective Even Start literacy program. Over time, the data won out. “Under the gun, Congress can do the right thing,” says Robert Gordon, who worked in the Office of Management and Budget under President Obama. “Now there’s no money to waste, so interest-group politics and bogus arguments don’t carry as much weight as they used to. There’s reason for optimism.”
We’re optimistic too, even though the obstacles to moneyball in government are daunting. Absent major changes in campaign finance, special interests that profit from blind budgeting will still have a powerful means of thwarting reform. Agencies’ staff will roll their eyes at the next round of “budget reforms,” wait out the incumbent, and then continue business as usual. And members of Congress will stay wedded to their legacy programs.
But we believe the federal budget crunch will force change. Already, many cities have had to choose between fewer cops and fewer teachers; between slower ambulance response and less-frequent garbage removal. The federal government is now beginning to face similarly stark choices. Do we really want to furlough hundreds of FBI agents at a time of heightened threats? Or lay off air-traffic controllers? Do we really want big cuts to the National Institutes of Health or to investments in early-childhood education, both of which are engines of economic growth? Do we really want to eat our seed corn?
Both parties have signed up to reduce nondefense discretionary spending (that is, the money for everything from the Food and Drug Administration to the NIH to the Department of Veterans Affairs) to levels that will be at least $350 billion lower through 2022 than they were in 2012. That would bring discretionary spending as a share of the economy to its lowest level on record (data go back to 1962). We should not do this blindly.
Moneyball doesn’t happen overnight. Many of the “sabermetrics” practices that transformed baseball a decade ago can actually be traced back to the Brooklyn Dodgers general manager Branch Rickey, who is universally known for breaking the color barrier in baseball but mostly unacknowledged for being the first GM to hire a professional statistician. Rickey’s approach got little traction for nearly half a century, until the pivotal 2002 season, when the Oakland Athletics’ general manager, Billy Beane, built his club on analysts’ data rather than scouts’ beliefs.
What prompted Beane’s gutsy decision to revive the data-centered approach? With $40 million to spend on players in 2002, the A’s had to compete on a comically uneven playing field with big-market teams like the Yankees, who spent $125 million on their roster that same year. In other words, scarcity drove Beane’s break from established tradition. We hope and expect it will have the same effect on Washington in the years to come.