One Easy Way to Make Wikipedia Better

Researchers say the online encyclopedia should have a source-o-meter on each page, reflecting the quality of citations.

The main chamber of the Old Library at Trinity College, in Dublin (Tony Webster / Flickr)

The Pacific Northwest tree octopus existed years before Wikipedia was founded. I’m using “existed” loosely here, of course, because there’s no such thing as a Pacific Northwest tree octopus.

The species, also known as Octopus paxarbolis, was invented as a joke in 1998, and got passed along the Internet as real. And though the species propelled itself across the web for three years before Wikipedia launched, its legacy as a cautionary tale about the integrity of information online is a particularly handy parable for critical thinking in the Wikipedia age. (Or any age, for that matter.)

Wikipedia isn’t perfect. Accuracy can be dicey in a digital environment where anyone can write and edit articles. Plenty of academic studies have concluded as much. But research also shows us that Wikipedia is pretty darn good—in some cases comparable to the quality of Encyclopaedia Britannica and its peers. Wikipedia’s robust policy on citations means that anyone with enough time on their hands could, theoretically, vet any given page for accuracy fairly easily. Enough time—and also proper access to obscure texts, academic journals, paywalled newspapers, and any other hard-to-reach sources that frequently show up in Wikipedia citations.

It turns out it’s pretty difficult to fully access key sources on any given Wikipedia page. That’s according to a study from researchers at Dartmouth’s Neukom Institute who assessed the 5,000 most-trafficked Wikipedia pages, analyzing them for verifiability. In other words, they didn’t check to see if Wikipedia pages were accurate; they investigated how easily someone could make that determination for themselves.

Based on the presence of markers like International Standard Book Numbers and Digital Object Identifiers, which are unique serial codes assigned to books and papers, about 80 percent of book citations and nearly 90 percent of journal references were technically verifiable, meaning you could track down the source material if you wanted to figure out whether a characterization on Wikipedia was right. But practical verifiability was a different story. It might be possible to track down the source material (as in, that source material actually exists and the link to get you there is working) but really difficult or impossible to actually read it. Using Google’s API, the researchers wrote a program to classify the accessibility of Google Books citations, for instance, and found that most books (71 percent) cited on Wikipedia are only partially viewable online, while many others (17 percent) are not viewable online at all. (About 12 percent were fully viewable.)
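To make the two kinds of verifiability concrete, here is a rough sketch in Python. It is not the researchers' program. It assumes citations arrive as raw Wikipedia citation-template text carrying `isbn=` or `doi=` parameters, and it assumes book accessibility is reported with the strings `ALL_PAGES`, `PARTIAL`, and `NO_PAGES`, which is how the Google Books API is understood to label viewability. The function names and the regular expressions are illustrative only.

```python
import re
from collections import Counter

# Assumption: citations are wikitext templates such as
# "{{cite book |title=... |isbn=978-0-306-40615-7}}".
# These patterns are deliberately loose and do not validate check digits.
ISBN_PARAM = re.compile(r"\|\s*isbn\s*=\s*([0-9Xx][0-9Xx\- ]{8,16})", re.IGNORECASE)
DOI_PARAM = re.compile(r"\|\s*doi\s*=\s*(10\.\d{4,9}/[^\s|}]+)", re.IGNORECASE)

# Assumption: viewability strings as the Google Books API is understood to report them.
VIEWABILITY_LABELS = {
    "ALL_PAGES": "fully viewable",
    "PARTIAL": "partially viewable",
    "NO_PAGES": "not viewable",
}


def is_technically_verifiable(citation: str) -> bool:
    """True if the citation carries an ISBN or DOI, so the source can be identified."""
    return bool(ISBN_PARAM.search(citation) or DOI_PARAM.search(citation))


def practical_access(viewability: str) -> str:
    """Map a viewability string to a plain-language label."""
    return VIEWABILITY_LABELS.get(viewability, "unknown")


def summarize(viewabilities: list[str]) -> dict[str, int]:
    """Return the share of each access label as a whole-number percentage."""
    counts = Counter(practical_access(v) for v in viewabilities)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total) for label, n in counts.items()}
```

Run over a page's book citations, `summarize` would produce a breakdown shaped like the one the study reports: a percentage fully viewable, a percentage partially viewable, and a percentage not viewable at all.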

“A lot of references are made available. But when you try to track them down, the main problem you run into is not that they’re fake or erroneous, but you can’t get to them,” said Michael Evans, a research fellow at the Neukom Institute and one of the study’s co-authors. “Typically it’s because of paywalls. Sometimes it’s because of link rot.”

“The point is, basically, that Wikipedia is not bad,” Evans added. “But it needs to meet its own standard for verifiability.”

Evans and his colleagues have an idea for how Wikipedia could begin to do this—and it’s a proposal that, if executed well, could dramatically improve access to information on the Internet. “You could just give some kind of meter about verifiability, actually on the Wikipedia page,” said Dan Rockmore, the director of the Neukom Institute and a co-author of the study. “That could be automated in a fairly simple way.”

He and Evans envision a browser plug-in, for instance, that would run a quick script to assess a Wikipedia page’s citations, then translate its findings into some sort of prominent verifiability scoring system displayed on the page. Such a metric could (perhaps with “smiley and frowny emoticons,” Rockmore offered) warn people about pages with low verifiability ratings, or add credence to easy-to-vet pages. Such a scoring system would incentivize sourcing articles with information that’s easy for people to check online, and could be used on basically any website that includes lots of citations. (News sites seem like one natural candidate.)
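What might such a script's scoring step look like? The sketch below is one guess, in Python, and nothing in it comes from Rockmore or Evans. The `Citation` fields, the half-credit weighting for sources that can be identified but not freely read, and the cutoffs for each emoticon are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    has_identifier: bool   # hypothetical: an ISBN, DOI, or working link exists
    freely_readable: bool  # hypothetical: the full source can be read online


def verifiability_score(citations: list[Citation]) -> float:
    """Average per-citation credit, from 0.0 (nothing checkable) to 1.0 (all readable)."""
    if not citations:
        return 0.0
    points = 0.0
    for c in citations:
        if c.freely_readable:
            points += 1.0   # full credit: anyone can check the source
        elif c.has_identifier:
            points += 0.5   # assumed half credit: findable, but paywalled or offline
    return points / len(citations)


def meter(score: float) -> str:
    """Translate a score into an emoticon; the thresholds are arbitrary assumptions."""
    if score >= 0.75:
        return ":-)"
    if score >= 0.4:
        return ":-|"
    return ":-("


page = [
    Citation(has_identifier=True, freely_readable=True),
    Citation(has_identifier=True, freely_readable=False),
    Citation(has_identifier=False, freely_readable=False),
    Citation(has_identifier=True, freely_readable=True),
]
print(meter(verifiability_score(page)))
```

Any real meter would have to settle the questions this sketch waves away, starting with how much a paywalled source should count against a page.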

Rockmore and Evans like to think that a verifiability meter would also push publishers and other information gatekeepers to do more to improve access to the thinking and research that shapes their work. In an ideal world, they say, everything would be freely available online. It’s a nice idea, but maybe not a realistic one: The debate over open access often comes back to questions of economics. Sure, giving everyone on the planet direct and unfettered access to every article, research paper, and book in existence would be wonderful. But, in practice, such an effort would take an enormous amount of money and time on the part of the producers and keepers of informational resources. “There’s this idea that open access is this ethical and moral thing, that it’s a morally and ethically grounded movement, and I can appreciate in a sense that it is,” Melissa Bates, a physiology researcher at the University of Iowa, told The Atlantic in 2014. “But there’s also a business model to how science is done.”

Which means publishers must have an economic incentive, not just an intellectual one, to make their work available to all. Wikipedia’s role in all this, Rockmore argues, should be to encourage a shift to a more verifiable informational world. Obviously Wikipedia can’t force newspapers and medical journals to change their business models, but a prominently placed meter like the one he and Evans envision would be a first step toward realizing that kind of accountability. “Ultimately, people do want a one-stop shopping place for information, and that’s kind of what Wikipedia is becoming,” he said. “I continue to feel that they need to take on a bigger responsibility than they already are.”

“In many dystopic versions of the future, it’s the information-access people who control everything,” he added. “When you recognize your market position in the information world, there’s sort of a moral shift that a responsible, socially-minded, honest futurist would think about. If we’re guiding the way people are using, acquiring, and accessing information, shouldn’t we be thinking all the time about how to do that as a social good?”