The new Foundations for Evidence-Based Policymaking Act requires federal agencies to spell out which questions they’re trying to answer and then to collect data systematically. But if called upon to help with this research, outside experts will need to do their part, too, by putting themselves in the shoes of the public officials who have to make the best decisions they can with the information they have.
As experimental economists, we are delighted to see randomized controlled trials being used to evaluate social policies. But as we documented in a recent working paper, the typical path from research to policy leaves much to be desired. Upon observing that a program has a statistically significant effect, often measured over a short period and within a small sample unrepresentative of the broader population, researchers might recommend that policy makers adopt the program across an entire city, state, or country. Unfortunately, this approach does not guarantee that the initial results are reproducible, that they persist over time, or that they can be scaled to a larger population. Important questions go unanswered: If we were to run the same trial again, would we observe the same effect? If we were to repeat the experiment in a different context or population, would we expect the findings to hold?
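To make the first of those questions concrete, here is a minimal simulation sketch (our illustration; the 0.2-standard-deviation effect size, the 50-per-arm sample, and the simple z-test are hypothetical choices, not taken from any study discussed here). It asks how often a small pilot of a genuinely effective program reaches p < 0.05, and how often a "significant" pilot would replicate at the same sample size:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

TRUE_EFFECT = 0.2   # hypothetical true effect, in standard-deviation units
PILOT_N = 50        # hypothetical participants per arm in the pilot
SIMS = 10_000       # number of simulated pilot studies

def run_trial(n):
    """Simulate one two-arm trial; return the estimated effect and
    whether it clears a two-sided z-test at the 5% level."""
    control = rng.normal(0.0, 1.0, n)
    treated = rng.normal(TRUE_EFFECT, 1.0, n)
    estimate = treated.mean() - control.mean()
    se = np.sqrt(control.var(ddof=1) / n + treated.var(ddof=1) / n)
    return estimate, abs(estimate / se) > 1.96

pilots = [run_trial(PILOT_N) for _ in range(SIMS)]
hits = [est for est, significant in pilots if significant]

print(f"pilots reaching p < 0.05:       {len(hits) / SIMS:.0%}")
print(f"mean estimate among those hits: {np.mean(hits):.2f} "
      f"(true effect: {TRUE_EFFECT})")

# Re-run only the 'successful' pilots at the same sample size.
replications = sum(run_trial(PILOT_N)[1] for _ in hits)
print(f"hits that replicate:            {replications / len(hits):.0%}")
```

Under these assumed parameters, only about one pilot in six reaches significance, the significant estimates average more than double the true effect, and a same-sized replication succeeds at that same low rate. Selecting on significance in small samples both inflates the apparent effect and makes the "successful" result hard to reproduce.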
We, too, understand the temptation to generalize from compelling preliminary results. A decade ago, one of us, John List, working with other scholars at the University of Chicago and Harvard University, launched the Chicago Heights Early Childhood Center, one of the most comprehensive longitudinal early-childhood studies ever conducted. Preliminary results suggest that a program offering financial rewards to young children and their families for good attendance and other important behaviors has raised children's test scores. However, the impact of these interventions on academic performance may be short-lived. As such, we need to wait patiently for medium- and long-term results (and measure additional outcomes besides test scores) before we can convincingly recommend our program to policy makers.
Jumping the gun can have substantial real-world consequences. Nothing demonstrates the need for long-term follow-ups better than the Moving to Opportunity program, in which families from impoverished neighborhoods got the chance to move to better-off areas. The program was a success, but that wasn't immediately evident: its substantial returns did not become clear until participating children reached adulthood.
Academia, however, makes little provision for tracking the effects of a program over the very long term. Even tenured scholars face pressure to publish. Journals in economics value novel, surprising, and positive findings, yet they do little to encourage replications or follow-up studies. Meanwhile, many research grants are too small to allow researchers to collect large samples or long-term data. As a result, economists, other social scientists, and even medical researchers are encouraged to design experiments that barely meet minimum standards for effective testing. Relatively brief trials with small samples that yield a "wow" result tend to be the desired outcome, securing a well-placed journal article and a boost to researchers' careers.