I'm going to tell you a number that's too big to imagine: 430 billion. That's how many web pages have been captured and preserved by the weird, wonderful Wayback Machine since the Internet Archive began crawling the web in 1996.
I learned this from Jill Lepore's engrossing profile of the Internet Archive, printed in The New Yorker this week. (Actually, that number has ballooned to 452 billion, and it's always climbing.) I also learned that Internet Archive founder and Wayback Machine inventor Brewster Kahle once decided to squeeze the entire web into a shipping container. Here's how Lepore tells it:
I was on a panel with Kahle a few years ago, discussing the relationship between material and digital archives. When I met him, I was struck by a story he told about how he once put the entire World Wide Web into a shipping container. He just wanted to see if it would fit. How big is the Web? It turns out, he said, that it’s twenty feet by eight feet by eight feet, or, at least, it was on the day he measured it. How much did it weigh? Twenty-six thousand pounds. He thought that meant something. He thought people needed to know that.
Kahle put the Web into a storage container, but most people measure digital data in bytes. This essay is about two hundred thousand bytes. A book is about a megabyte. A megabyte is a million bytes. A gigabyte is a billion bytes. A terabyte is a million million bytes. A petabyte is a million gigabytes. In the lobby of the Internet Archive, you can get a free bumper sticker that says “10,000,000,000,000,000 Bytes Archived.” Ten petabytes. It’s obsolete. That figure is from 2012. Since then, it’s doubled.
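Lepore's unit arithmetic is easy to check for yourself. Here is a minimal Python sketch, assuming the decimal (SI) units she describes rather than the binary, 1,024-based ones. The constant and function names are mine, made up for illustration, and the "doubled" figure is simply twice the bumper-sticker number, not a measured total.

```python
# Decimal (SI) byte units, following the definitions in the quoted passage.
MEGABYTE = 10**6    # "a million bytes"
GIGABYTE = 10**9    # "a billion bytes"
TERABYTE = 10**12   # "a million million bytes"
PETABYTE = 10**15   # "a million gigabytes"

# The figure printed on the bumper sticker (from 2012).
BUMPER_STICKER_BYTES = 10_000_000_000_000_000


def to_petabytes(n_bytes: int) -> float:
    """Convert a byte count to decimal petabytes."""
    return n_bytes / PETABYTE


sticker_pb = to_petabytes(BUMPER_STICKER_BYTES)
# Assumption: "doubled" is taken literally as exactly twice the sticker figure.
doubled_pb = to_petabytes(2 * BUMPER_STICKER_BYTES)

print(f"Bumper sticker: {sticker_pb:g} petabytes")
print(f"Doubled:        {doubled_pb:g} petabytes")
```

Run as written, this should report ten petabytes for the sticker and twenty for the doubled figure, which squares with the passage.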
Others, too, have endeavored to turn the web into something you can pick up and turn over with your hands. As of July 2013, a crowdsourced effort to print out the entire web had produced 10 tons of pages—the equivalent of three or four baby blue whales, as the Washington Post put it. "It’s a lot of paper. Yet it’s not even a sliver of the whole Internet."
The whole Internet is hardly something that can be counted or printed or put into a shipping container. And so far it's not even something that can be preserved, not comprehensively—not even close. But Kahle is trying.
"The Internet as most people now know it—Web-based and commercial—began in the mid-nineties," Lepore wrote. "Just as soon as it began, it started disappearing. And the Internet Archive began collecting it."
You can read the rest of her story here.