ClimateGate III: The Mystery of the Missing Data

Over the weekend, I came in for some probably deserved criticism from Clive Crook over my initial, somewhat airy, reaction to Climategate.  In my defense, he quotes my first post on the topic, not the follow-up.  That was early innings, and my initial estimation of the emails that got the most press at the beginning--particularly the "trick" email--hasn't changed all that much.  Sexing up a graph is a bad thing.  But the world is not going to plunge off a cliff because of one overdone graph.  I've become considerably more concerned about items that have subsequently gotten more attention.

Clive says:

Megan McArdle adopts a world-weary tone similar to The Economist's: this is how science is done in the real world. If I were a scientist, I would resent that. She has criticised the emails and the IPCC response to them, then says she still believes the consensus view on climate change. Well, that was my position at the end of last week, and I suppose it still is. But how do I defend it? There is far more of a problem here for the consensus view than Megan and ordinarily reliable commentators like The Economist acknowledge. I am not a climate scientist. In the end I have to trust the experts. That is what we are asked to do. "Trust us, we're scientists".

He is right:  this is not how science should be done.  Bullying, groupthink, and bad behavior take place, even by scientists who are right--but that is not to say that I approve of it.  And I confess, some of the revelations are making it harder for me to trust this group of scientists about the magnitude of the change, even though I am still pretty confident about the direction.

  1. They apparently tried to organize a deletion of files in order to avoid an FOI request.  This is horrifying, and I simply cannot understand why so many of their supporters are willing to downplay it.  A couple of sample quotes:  "Unfortunately, there are also a couple of messages that suggest an effort to destroy emails that might have been subject to a Freedom of Information request.  That's a genuine problem, though it's not clear to me just how big a problem it is. . . . So on a substantive level, there's really very little to this."  That's from Kevin Drum, whom I greatly respect.  More worrying is Real Climate:  "Suggestions that FOI-related material be deleted ... are ill-advised even if not carried out. What is and is not responsive and deliverable to an FOI request is however a subject that it is very appropriate to discuss."

    Words fail one, reading that latter comment.  Ill-advised? Deleting data in order to avoid an official information request is a crime, as is trying to coordinate same, even if you fail in the execution.  It's also grossly unethical, and hard to reconcile with any reasonable understanding of science.  Moreover, it's the sort of thing that is often done by people who have nasty secrets, so it's hard to pass it off with a blithe, "Oh, dear, now that was a wee bit naughty!"

    Imagine reading this email exchange coming from, say, senior officials in the Bush administration.  Would any of these bloggers regard this as the ethical equivalent of jaywalking on an empty street?  

    It's entirely possible that the aspiring self-censors were merely trying to avoid some trivial embarrassment, since we have no idea what, if anything, was actually deleted.  But it does not inspire the kind of trust you want to have in people who are advocating massive economic dislocations.

  2. There is strong evidence that a small group of scientists has inappropriate power over the process of consensus-building.  Particularly, they seem to have exercised considerable sway over the peer review process at prominent outlets, while simultaneously deriding their critics because . . . they weren't being published in those peer-reviewed journals.  As Derek Lowe says, "But while it may have happened somewhere else, that does not make it normal (and especially not desirable) scientific behavior. This is not a standard technique by which our sausage is made over here."

    As with the FOI deletions, I find the defenses incredibly underwhelming.  They boil down to the fact that these scientists were sick of answering criticisms from people they don't like.  And I am actually sympathetic.  Corporate groups and conservative interests did put a lot of money into battling any evidence of anthropogenic global warming, for reasons that had very little to do with a commitment to solid science.  Having gone more than a few rounds with critics like this, I heartily empathize with the weariness.  But unfortunately, having people you don't like crawl all over your work looking for errors is, er, science.  When I come across scientists who don't get that, well, my trust in their work sort of plummets.  Contrast this with something like Steven Levitt's classy response when he was caught in coding errors.

  3. It is not clear to me that CRU can now reproduce their own data set.  There has been a lot of bad reporting on this, including by me--based on early reports, I originally thought that the problem was with their models, rather than the data underlying them.  A number of other people have reported that the data have vanished, or triumphantly refuted this false claim as if it ended the story.  The truth is more prosy.  CRU still has a database.  What was lost is the source data.  But it is not accurate to say, as they have claimed, that this isn't a problem because you can always go back and get the source data from the original agencies.  I mean, it's great news that there wasn't some cataclysmic tragedy that caused us to lose all our original records of global surface temperatures over the last 150 years.  But it still raises questions about the integrity of their data.

    Assembling a data set from many disparate sources is a massive process.  You have to normalize the data so that the records are equivalent, which often means tossing out some records.  In the case of climate records, you also have to apply substantial corrections to the raw temperature records, to account for various factors--as I understand it, this means things like equipment malfunction and changes in the surrounding area, as well as normalizing the data so that it looks more like a grid. It is not adequate to say that people can go get the raw data elsewhere, because first of all, that effort is enormous, and second of all, that still wouldn't tell them how you standardized the data, which is the controversial bit they want to look into.

    Bad enough that they won't share, but some of the dumped documents, and this story from the Times, imply that they can't. The now-infamous "Harry_README" file seems to chronicle the efforts of a programmer to figure out how one of the earlier data sets was assembled from the raw data. He eventually gives up in despair, because they seem to have exercised extraordinarily poor source control.

    When trillions of dollars' worth of global economic growth are riding on models that are built using your data, it seems sort of elementary to keep a copy of the raw data, and a record of what you did to it. I don't want to sound like some naive pundit holding scientists to an impossible standard of perfection--certainly, real world data sets all have their flaws.  But the less record we have of how a particular data set was created, the less reliable we consider that data to be.  If it is true that they cannot reproduce their own data set from scratch, then I think they have been claiming an inappropriately high degree of reliability, as have the authors of any models or analyses constructed around these figures.

That said, there are a bunch of things I don't think.  I don't think that this proves that AGW is all bosh--there are other data sets that generate roughly similar results, though I believe CRU's is the most comprehensive.  I do not think that we are seeing evidence of a conspiracy to fabricate data.  I see little that has direct bearing on the various disputes over the "hockey stick" and other graphs.  Rather, I see an indirect problem, which is that these scientists allowed themselves to become politicized and hostile to outsiders in a way that may have compromised the quality of their work.  This is probably best summed up by Will Wilkinson:

The idea that the science behind predictions of potentially catastrophic warming is rock solid and that the putative scientific consensus reflects the rock solidity of the science licenses the inference that there is no scientifically respectable excuse for skepticism of or disagreement with the consensus. That is a big stick to thump people with. But the Climategate files strongly suggest that at least some of the science is not rock solid and that the scientific consensus is at least in part the product of silencing or marginalizing those who might upset it. The files have made "How can we be sure that you did not fudge your data" and "How do we know that dissenting voices have been given a fair hearing?" questions that we now must ask rather than questions skeptics can be effectively shouted down for asking. The files show that suspicion is warranted. That's a big deal.

It is not surprising to see a "Move along! Nothing to see here!" response from alarmists, but there is certainly something to see. Though I'm sure some ideologues will merely amp up their armtwisting thug tactics to protect the fragile perception of consensus they had achieved (precioussssssss!), I predict that the overall response from the scientific community will be healthy and invigorating. Climate science will become more transparent and more rigorously by-the-book because climate scientists are becoming more fully aware that the impulse to jealously protect a public perception of consensus can undermine itself by producing questionable science and a justifiably skeptical public.

There's been substantial evidence of what Will calls "motivated cognition" from both skeptics and advocates--for most people, this seems like some sort of a Rorschach blot that shows you whatever you already believed.  I don't think I've fallen into that particular trap, because this definitely is not what I believed about major climate scientists a month ago.  But obviously, I am prey to other errors, not least the fact that I am still quite dependent on information from very motivated and often angry experts on both sides.