No one is entirely clear on how Brian Nosek pulled it off, including Nosek himself. Over the last three years, the psychologist from the University of Virginia persuaded some 270 of his peers to channel their free time into repeating 100 published psychological experiments to see if they could get the same results a second time around. There would be no glory, no empirical eurekas, no breaking of fresh ground. Instead, this initiative—the Reproducibility Project—would be the first big systematic attempt to answer a question that has been vexing psychologists for years, if not decades: What proportion of results in their field are reliable?
A few signs hinted that the reliable proportion might be unnervingly small. Psychology has recently been rocked by several high-profile controversies, including the publication of studies documenting impossible effects like precognition, failures to replicate the results of classic textbook experiments, and some prominent cases of outright fraud.
The causes of such problems have been well documented. Like many sciences, psychology suffers from publication bias, where journals tend to publish only positive results (that is, those that confirm the researchers’ hypothesis), while negative results are left to linger in file drawers. On top of that, several questionable practices have become common, even accepted. A researcher might, for example, check whether they had a statistically significant result before deciding whether to collect more data. Or they might report only the results of “successful” experiments. These acts, known colloquially as p-hacking, are attempts to torture positive results out of ambiguous data. They may be done innocently, but they flood the literature with snazzy but ultimately false “discoveries.”
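To see why peeking at the data is so corrosive, here is a toy simulation (not from the article; the parameters—ten peeks, batches of ten observations, a simple z-test—are illustrative assumptions). A simulated researcher tests for an effect that does not exist, checking the p-value after every batch and stopping the moment it dips below .05. The nominal false-positive rate should be 5%; optional stopping inflates it severalfold.

```python
# Toy simulation of "optional stopping" p-hacking: peek at the p-value
# after each batch of data and stop as soon as p < .05. There is no
# real effect (the true mean is 0), so every "significant" result is
# a false positive.
import math
import random

def z_test_p(xs):
    """Two-sided p-value for H0: mean = 0, assuming known sd = 1."""
    n = len(xs)
    z = (sum(xs) / n) * math.sqrt(n)  # sample mean / standard error
    # Two-sided p from the standard normal CDF, via math.erf
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def one_study(peeks=10, batch=10):
    """Collect data in batches; stop early at the first p < .05."""
    xs = []
    for _ in range(peeks):
        xs += [random.gauss(0, 1) for _ in range(batch)]
        if z_test_p(xs) < 0.05:
            return True  # declared "significant" -- a false positive
    return False

random.seed(1)
trials = 2000
false_pos = sum(one_study() for _ in range(trials)) / trials
print(f"False-positive rate with peeking: {false_pos:.1%}")
```

With honest, fixed-in-advance sample sizes the rate would hover near 5%; with repeated peeking it climbs well above that, which is exactly how ambiguous data gets tortured into a publishable result.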