Look, for instance, at the Wikipedia pages about the Bronze Soldier of Tallinn (a highly controversial monument in Estonia whose relocation sparked one of
the world's first 'cyberwars' between Russia and Estonia). The Estonian and Russian versions of that article present interestingly different accounts
of the very same place and events. The Arabic and Hebrew articles about Hezbollah offer perhaps an even starker contrast of the ways in which different
communities of editors agree on different types of representation and truths.
Research carried out independently by Brent Hecht, myself, and others has found that each language edition of Wikipedia represents encyclopaedic knowledge in highly
diverse ways. Not only does each language edition include different sets of topics, but when several editions do cover the same topic, they often put
their own, unique spin on the topic. In particular, the ability of each language edition to exist independently has allowed each language community to
contextualize knowledge for its audience.
It is important that different communities are able to create and reproduce different truths and worldviews. And while certain truths are universal
(Tokyo is described as a capital city in every language version that includes an article about Japan), others are more messy and unclear (e.g. should
the population of Israel include occupied and contested territories?).
The reason that Wikidata marks such a significant moment in Wikipedia's history is the fact that it eliminates some of the scope for culturally
contingent representations of places, processes, people, and events. However, even more concerning is the fact that this sort of congealed and
structured knowledge is unlikely to reflect the opinions and beliefs of traditionally marginalized groups.
We know that Wikipedia is a highly uneven platform. We know that not only is there not a lot of content created from the developing world, but there
also isn't a lot of content created about the developing world. And we also know that,
even within the developed world, a majority of edits are still made by a small core of (largely young, white, male, and well-educated) people.
For instance, more edits originate in Hong Kong than in all of Africa
combined; and there are many times more edits to the English-language article about childbirth by men than by women.
What does this mean for structured, semantic data in Wikipedia? If we start to rely on a singular source for our truths, it will undoubtedly in most
cases make most articles more accurate and current. But it also means that in contested cases, we will likely see an even more vivid reinforcement of
existing core/periphery inequalities of knowledge production. A disagreement over facts or data would no longer be confined to a specific article and
language, but would most likely have to be conducted in English, with an unfamiliar community of editors.