Whoa: 22% of *All the World's Web Pages* Reference Facebook

Is the web "turning upside-down"?

Matthew Berk

Revealing stat of the day: 22 percent of web pages contain Facebook URLs. At this point, in other words, more than a fifth of all web pages in the world -- 242 million of 1.3 billion -- reference the Mark Zuckerberg Production.

This is per an analysis conducted by the researcher Matthew Berk, who used data gathered by Common Crawl, a nonprofit project that crawls the web much as Google does. The data covered the nearly 1.3 billion URLs the project has crawled so far in 2012. And Berk's findings, he writes, offer some evidence for what he calls "the web turned upside down" -- a reweaving of the fabric of the World Wide Web based on social connections. The "structure and metaphors" of the planet's social graph, Berk argues, "will eventually reshape the Web into something completely different."

Among the evidence he marshals to make that case:

  • There are more than 471 million total Facebook URLs.
  • Of some 500 million hardcoded links to Facebook, only 3.5 million -- 0.7 percent -- are unique.
  • The Like button is the top Facebook URL, accounting for nearly 16 percent of the total.
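
For the curious, here is roughly how tallies like these could be produced. This is a minimal sketch, not Berk's actual method: the three sample pages, the regular expression, and the function names are all made up for illustration, and a real run over Common Crawl data would need a proper HTML parser and far more plumbing.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample pages standing in for crawled documents (assumption:
# the real analysis ran over Common Crawl data, not a dict like this).
PAGES = {
    "http://a.example/": '<a href="http://www.facebook.com/plugins/like.php">Like</a>',
    "http://b.example/": (
        '<a href="http://www.facebook.com/plugins/like.php">Like</a>'
        '<a href="https://facebook.com/somepage">Find us</a>'
    ),
    "http://c.example/": '<a href="http://example.org/">Elsewhere</a>',
}

# Crude pattern for absolute URLs in href/src attributes (illustrative only).
URL_RE = re.compile(r'(?:href|src)\s*=\s*["\'](https?://[^"\']+)["\']', re.IGNORECASE)


def is_facebook(url):
    """True if the URL's host is facebook.com or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == "facebook.com" or host.endswith(".facebook.com")


def summarize(pages):
    """Count total and unique Facebook URLs and the share of pages with any."""
    counts = Counter()
    pages_with_facebook = 0
    for html in pages.values():
        fb_urls = [u for u in URL_RE.findall(html) if is_facebook(u)]
        if fb_urls:
            pages_with_facebook += 1
        counts.update(fb_urls)
    total = sum(counts.values())
    top_url, top_count = counts.most_common(1)[0] if counts else (None, 0)
    return {
        "pages_with_facebook": pages_with_facebook,
        "page_share": pages_with_facebook / len(pages) if pages else 0.0,
        "total_urls": total,
        "unique_urls": len(counts),
        "top_url": top_url,
        "top_share": top_count / total if total else 0.0,
    }


if __name__ == "__main__":
    for key, value in summarize(PAGES).items():
        print(key, value)
```

The same three measures appear in the bullets above: how many links there are in total, how many of those are unique, and how much of the total the single most common URL accounts for.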

Perhaps Berk's most interesting finding, though, is that 7.5 percent of the websites he analyzed use Facebook's Open Graph tags in their pages.
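
What does "using Open Graph tags" look like in practice? Open Graph data lives in `<meta>` elements whose `property` attribute starts with `og:`. The sketch below shows one way a crawler might spot them, using only Python's standard library. The sample page and the helper names are hypothetical, and this is not a description of how Berk's analysis detected the tags.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = (attr.get("property") or "").lower()
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            self.tags[prop] = content


def open_graph_tags(html):
    """Return a dict of the Open Graph properties found in the HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical page used only for illustration.
SAMPLE = """<html><head>
<meta property="og:title" content="Example Story" />
<meta property="og:type" content="article" />
<meta name="description" content="Not an Open Graph tag" />
</head><body></body></html>"""

if __name__ == "__main__":
    print(open_graph_tags(SAMPLE))
```

A page counts as "using Open Graph" in this sketch if the returned dict is non-empty; a stricter check might require specific properties such as `og:title` and `og:type`.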


"This is a deeper level of integration," Berk notes -- one that will help make Facebook more indexable and visible to the web at large. "Much the same way that the Google Toolbar and its caching mechanism gave the search giant live glimpses of the Web as it was consumed by people," he writes, "these snippets effectively position the Web as a live, visible extension of the entities Facebook is seeking to have users define (through pages and applications)."

Facebook and the open web: increasingly one giant, tentacular megabeast.

So the takeaway of Berk's study isn't just Facebook's current, huge influence on the web (although that's definitely part of it -- one fifth of all pages! around the world!). The takeaway is also -- and maybe more so -- the power of Facebook's infrastructure as it's thus far been integrated into the web. And the takeaway is also the social and structural and intellectual implications of that synthesis. Again, Berk:

Increasingly, people and organizations will seek to write themselves not to websites, but to the big "platforms" (APIs) like Facebook and Twitter. And more and more, websites are being rewoven into those social networks, whether by simple inclusions of like or +1 buttons, or through more complex reflections of social connection.

When I look at the data, it's pretty amazing. On Lucky Oyster, which is an alpha application for social discovery, it's not uncommon for an occasional user of Facebook with only 20 friends to be intricately connected with upward of 20,000 entities. Active users with around 1,000 friends are consistently connected to well over 100,000 entities. I believe that reading this graph is different than reading the Web; it requires both a new mental model and new technology.

A new mental model and a new technology -- which is, ironically, exactly what the web has been. So far.