The Promise and Limits of Google's 'Data Liberation Front'

A 'simple' business decision has complex and still-unfolding effects.

I figure I might as well go all-in on this topic. Previous entries here and here. Today, two more reader dispatches concerning which parts of your data you can and can't retrieve if a cloud service you'd relied on is turned off -- as Google has recently done with Reader and many of its other services.

1) Making backups of Gmail. I mentioned yesterday that, to Google's credit, "It has been a leader in making sure you could make your own copies, or extract, any of your info that was in its part of the cloud." A reader writes:

There is a notable exception: Gmail. You can download all your mail via POP3 or IMAP but Google throttles the download speed:

I decided to create a local backup of my mails at Gmail a few weeks ago and have managed to download around 100,000 mails so far, i.e., the throttling is probably more restrictive than Google mentions. In the best case according to the information provided by Google, it would still take me around a week to download all my mail from Gmail via IMAP. There is therefore no efficient way to migrate your Gmail account to another mail provider. And if you want to keep all your mails and all your labels, you get each mail at least twice ('all mails' or 'sent' plus label).

I am still happy that the Google Data Liberation Front shows new signs of life after it had looked abandoned for years. At least for Gmail, however, it is not that useful. And I hope in any case that I do not have to migrate away from Gmail (or Google Apps for Business in my case).

I have made piecemeal IMAP archives of my Gmail cloud archives over the years (via Thunderbird and also Apple Mail), rather than trying to do it at one go, so I had not noticed the restrictions the reader mentions. But they're worth bearing in mind. Of course, Gmail is so central a part of Google's offerings, and of its burgeoning for-pay business apps, that it is hard to imagine Gmail ever being turned off as a conscious business decision. Still, I feel better having my own backups, just in case.
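
For anyone attempting the full IMAP backup the reader describes, the duplicate-copies problem (each message arriving once under 'all mails' or 'sent' and again under each label) can be cleaned up after the download. Here is a minimal sketch in Python, assuming the messages have already been fetched as raw bytes and grouped by folder. The function name dedupe_by_message_id and the shape of the folders dictionary are my own illustration, not anything Google provides, and it assumes that copies of one message share the same Message-ID header.

```python
import hashlib
from email import message_from_bytes
from typing import Dict, List


def dedupe_by_message_id(folders: Dict[str, List[bytes]]) -> Dict[str, dict]:
    """Collapse copies of one message that were downloaded under several labels.

    `folders` (a hypothetical layout) maps a folder or label name to the raw
    messages fetched from it. The result holds one entry per Message-ID: the
    raw bytes stored once, plus every label the message appeared under.
    """
    archive: Dict[str, dict] = {}
    for label, raw_messages in folders.items():
        for raw in raw_messages:
            msg = message_from_bytes(raw)
            msg_id = (msg.get("Message-ID") or "").strip()
            if not msg_id:
                # No Message-ID header: key on a content hash so the
                # message is still kept rather than silently dropped.
                msg_id = "sha256:" + hashlib.sha256(raw).hexdigest()
            entry = archive.setdefault(msg_id, {"raw": raw, "labels": []})
            if label not in entry["labels"]:
                entry["labels"].append(label)
    return archive


if __name__ == "__main__":
    note = b"Message-ID: <1@example.com>\r\nSubject: Hi\r\n\r\nBody\r\n"
    merged = dedupe_by_message_id({"All Mail": [note], "Travel": [note]})
    for msg_id, entry in merged.items():
        print(msg_id, entry["labels"])
```

Storing each message once and keeping the labels alongside it preserves the organization without doubling the size of the archive. It does nothing about the throttling itself, which still governs how long the download takes.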

2) 'BOOOOOOOO!' for Downstream Effects. A reader in the tech biz writes:

I think the move to paid services is ultimately probably wise for all of us info junkies, except that the inclusion of RSS functionality on any given site may be hampered by the lack of Google's huge availability. Why code for it, or more to the point, provide it at all if even the dismissing eyeballs that Reader provided are no longer there?

One other side effect that has been under-discussed but that tails off of the Google Data Liberation Front (what a stellar name, it should be a not-quite-ready-for-prime-time Brooklyn band name): the history of pages you've already read is stored in Reader but I don't believe it exports with your feeds.

I can, today, search in Reader for a Lifehacker article about a shelving system that I thought about building during a bout of DIYness. That resource will be gone to me. Or a Homegrown Revolution post about a wild-cherry growing system that is self-sustaining. Or a Sullivan post that I have been meaning to email him about for two-plus years. Or a Fallows post on civil aviation and comparisons to Ask the Pilot posts from four years ago, before Salon ruined their RSS feed system and I stopped reading Salon.

Come to think of it, that's the biggest fear. When Salon ruined their RSS feeds I went from 9 individual feeds I cleared daily to zero (RIP How the World Works, one of my favorite blogs of all time). Ditto for Wonkette recently when they went to teaser RSS posts instead of full entries -- I can't open posts calling John McCain "walnuts" and accusing him of senility at work! This angle is the Iran-news-access-by-proxy service Google was providing through Reader (not to compare my circumstances to those of the oppressed in Iran).

As we say in App Dev, the downstream effects and the lack of even a frozen "legacy system" for historical purposes are severe and, worse, not clear.


I'm not going to bother with a paraphrasing of these posts for readers not involved in the computer world, since people most affected by these changes are likely to understand the arguments as presented above. I will say that what must have seemed to Google a simple, clear-cut business choice -- let's stop messing around with the "interesting" little diversions and concentrate on our mainstream products -- is having more complex "downstream effects" than most people might have foreseen.