Missing Links: Access to Research Papers' Raw Data Drops 17% a Year

The older scientific research is, the more of its original sources we lose.

Where have you stored the records of your life? Some of them, most likely, you've outsourced: your bank statements, your medical records, your grades. Some of them, however, you've likely left to yourself. Which leads to a kind of inevitable decline in the physical evidence of our own existence. Lost teeth, baby clothes, early diaries, that eighth-grade paper on Abraham Lincoln: these are records of ourselves that we tend to shed along the way. Boxes get lost. Papers get pruned. The past is abandoned for just what it is. This is unfortunate, perhaps, but inevitable, at least in a world in which the storage of things takes space and effort.
You'd think it would be different for scientists. They collect their data, after all, not just on behalf of themselves, but of all of us.
As it turns out, though, it is not different for scientists.
Nature reported today on a study, newly published in the journal Current Biology, that tracked the raw data scientists have gathered that inform the conclusions they reach in their published papers. It was a treasure hunt for the past, basically: The large team of researchers looked for the data that informed 516 papers that were published between 1991 and 2011 in the field of ecology. (The researchers, Nature notes, "selected studies that involved measuring characteristics associated with the size and form of plants and animals, something that has been done in the same way for decades.")
The data-hunters' first task was to get in touch with the papers' authors. They were able to do so only in an astoundingly low 37 percent of cases. Which was in part because of the rapid evolution of contact information: "The likelihood of being able to find a working e-mail address, even after an extensive online search, declined by 7 percent per year," Nature writes. The even bigger problem, however, was more human in scope: Only about 50 percent of the authors who still had valid addresses responded to the researchers' requests—and that was regardless of the age of their papers.
And when the researchers were able to get in touch with the authors, their discovery was even more dire: While data for almost all of the studies published as recently as 2011 were still accessible, the chances of the data remaining accessible fell by a whopping 17 percent with each additional year of a paper's age. Each year. For research from the not-that-distant early 1990s, data availability dropped to as little as 20 percent.
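To get a feel for what a 17 percent drop "each year" adds up to, here is a rough sketch in Python. It is not the study's model. The article does not say whether the figure applies to the probability that the data survive or to the odds of survival, so the sketch tries both readings. The function names and the 0.95 starting value (a stand-in for "almost all") are assumptions made for illustration.

```python
def remaining_after(years, annual_drop=0.17, start=1.0):
    """Share left if a quantity shrinks by a fixed fraction every year."""
    return start * (1.0 - annual_drop) ** years


def probability_if_odds_decay(years, annual_drop=0.17, start_probability=0.95):
    """Availability if the odds p / (1 - p), rather than p itself, shrink each year.

    start_probability is a hypothetical value for brand-new papers.
    """
    start_odds = start_probability / (1.0 - start_probability)
    odds = remaining_after(years, annual_drop, start_odds)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Compare the two readings for papers of different ages (in years).
    for age in (0, 5, 10, 20):
        direct = remaining_after(age)
        via_odds = probability_if_odds_decay(age)
        print(f"{age:>2} years: direct {direct:.3f}, via odds {via_odds:.3f}")
```

The two readings diverge quickly as papers age, which is why the choice of starting value and of what exactly is shrinking matters when turning a per-year figure into a 20-year one.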
This phenomenon will feel familiar to anyone who follows science, academia, and the law. You may know it as "link rot." Or as, relatedly, "the half-life of facts." Or as that thing that happened to your Abraham Lincoln term paper. Nothing about the data decline is limited to the analog (or, for that matter, the digital) world. A recent study of Supreme Court decisions—the rhetorical documents of the highest judicial body in the land—found that 49 percent of the web links cited within those decisions are now dead.
"We shed as we pick up," Tom Stoppard had it, "like travelers who must carry everything in their arms, and what we let fall will be picked up by those behind."
Except when it is not.