Fundamental changes are afoot at Wikipedia, changes that have worrying implications for the diversity of knowledge on the world's sixth most popular website.
Wikipedia, with a new initiative called Wikidata, is radically reconfiguring itself to take advantage of the "Semantic Web." Wikidata will create a collaborative database that is both machine readable and human editable and which will underpin a lot of knowledge that is presented in all 284 language versions of Wikipedia.
In other words, the encyclopaedia plans to become part of the movement from a mostly human-readable Web to a Web in which computers and software can better make sense of information.
This system becomes especially useful for facts that are embedded in a variety of pages. If Mitt Romney were to become President of the United States, there would be hundreds or thousands of pages in all of the language versions of Wikipedia that would need to be altered to reflect that fact. Wikidata would allow all of those references to be immediately updated after only one change in the central Wikidata repository.
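The "one change, many pages" idea can be illustrated with a small sketch. It is not Wikidata's actual data model or API: the names `central_facts`, `page_templates`, `FACT_KEY`, and `render_pages` are hypothetical, and the placeholder value and page texts are invented for illustration. It only shows how pages that reference a shared fact, rather than copying it, pick up a single central edit.

```python
# Hypothetical central fact store: one value per (item, property) pair.
FACT_KEY = ("United States", "president")
central_facts = {FACT_KEY: "(current office holder)"}

# Hypothetical per-language page templates. Each page references the
# shared fact through a placeholder instead of storing its own copy.
page_templates = {
    "en": "President of the United States: {value}",
    "de": "Präsident der Vereinigten Staaten: {value}",
    "fr": "Président des États-Unis : {value}",
}


def render_pages(facts, templates, key):
    """Render every language page, looking the value up at render time."""
    value = facts[key]
    return {lang: text.format(value=value) for lang, text in templates.items()}


before = render_pages(central_facts, page_templates, FACT_KEY)

# A single edit to the central store...
central_facts[FACT_KEY] = "Mitt Romney"

# ...is reflected in every language version on the next render.
after = render_pages(central_facts, page_templates, FACT_KEY)

for lang, text in after.items():
    print(lang, "->", text)
```

The same property that makes this convenient is the one at issue here: because there is a single stored value, every language version necessarily displays the same one.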
This is a highly significant change to the way Wikipedia works. Until now, the Wikipedia community has never attempted any sort of consistency across all languages.
Look, for instance, at the Wikipedia pages about the Bronze Soldier of Tallinn (a statue at the centre of a highly controversial moment in Estonia's history that sparked one of the world's first 'cyberwars' between Russia and Estonia). The Estonian and Russian versions of that article present interestingly different accounts of the very same place and events. The Arabic and Hebrew articles about Hezbollah offer perhaps an even starker contrast in the ways in which different communities of editors agree on different types of representation and truths.
Research carried out independently by Brent Hecht, myself, and others has found that each language edition of Wikipedia represents encyclopaedic knowledge in highly diverse ways. Not only does each language edition include different sets of topics, but when several editions do cover the same topic, they often put their own, unique spin on the topic. In particular, the ability of each language edition to exist independently has allowed each language community to contextualize knowledge for its audience.
It is important that different communities are able to create and reproduce different truths and worldviews. And while certain truths are universal (Tokyo is described as a capital city in every language version that includes an article about Japan), others are more messy and unclear (e.g. should the population of Israel include occupied and contested territories?).
Wikidata marks such a significant moment in Wikipedia's history because it eliminates some of the scope for culturally contingent representations of places, processes, people, and events. Even more concerning, however, is the fact that this sort of congealed and structured knowledge is unlikely to reflect the opinions and beliefs of traditionally marginalized groups.
We know that Wikipedia is a highly uneven platform. We know that not only is there not a lot of content created from the developing world, but there also isn't a lot of content created about the developing world. And we also know that, even within the developed world, a majority of edits are still made by a small core of (largely young, white, male, and well-educated) people. For instance, more edits originate in Hong Kong than in all of Africa combined, and men make many times more edits than women to the English-language article about childbirth.