Comparative Effectiveness Redux

By Megan McArdle

Apropos of my post last week on asthma inhalers, an academic who asked not to be named wrote:

Maybe you should elaborate for some of your readers on the difficulty of arguing for null effects. In my experience, when people trying to get peer-reviewed science pubs argue for null effects, they do multiple experiments or the strongest test possible (which seems most relevant here). This study seems to have done neither, and instead subjected all inhaler patients to crappier inhalers by showing that the mildest sufferers of asthma did not show a statistically significant difference on their relatively insensitive test.

This is a hugely important point.  And there are a lot of ways in which these tests seemed deliberately designed to "prove" that there was no difference:

  • Small sample:  the smaller the sample, the harder it is to find an adverse effect. That's why drugs like Vioxx made it to market:  distinguishing problems from background statistical noise needed a lot of patients.  I know more than one analyst who argues that medical studies are generally too small--because humans are so variable, they don't reliably pick up any but the strongest effects.  Hence the steady stream of articles proving that antioxidants will kill you/make you live forever/make you fat/make you thin/improve your singing voice/cause your fingers to fall off.
  • Short timespan.  One study ran for a year.  The rest were 6-8 weeks.
  • Only mild-to-moderate asthmatics included.  These asthmatics are generally well controlled, and don't have crises that often.  If you have a 20% increase in the number of crises over a year, but the asthmatics in your study only have a crisis once a week, it will be hard to distinguish that from statistical noise, especially given the small samples and short timeframes.
  • The differences may be hard to quantify, and thus not show up in the study:  if your breathing gets 30% worse, the doctor can't tell unless he happens to have you on hand to measure while you're having an attack.  And if you only have attacks rarely, he probably won't catch one.
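The power problem behind the first and third bullets is easy to see in a simulation. Here is a minimal Monte Carlo sketch in Python (the patient counts, crisis rates, and study lengths are invented for illustration, not taken from the actual inhaler studies): a real 20% increase in crisis frequency is nearly invisible to a small, short study of patients with rare crises, while a large, year-long study picks it up almost every time.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson-distributed count via Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def trial_detects(n, weeks, rate, increase, rng, z_crit=1.96):
    """Simulate one study: n patients per arm, crisis counts ~ Poisson(rate * weeks).
    Returns True if a two-sample z-test on mean crisis counts reaches p < 0.05."""
    control = [poisson(rate * weeks, rng) for _ in range(n)]
    treated = [poisson(rate * weeks * (1 + increase), rng) for _ in range(n)]
    m_c = sum(control) / n
    m_t = sum(treated) / n
    var_c = sum((x - m_c) ** 2 for x in control) / (n - 1)
    var_t = sum((x - m_t) ** 2 for x in treated) / (n - 1)
    se = math.sqrt(var_c / n + var_t / n)
    return se > 0 and abs(m_t - m_c) / se > z_crit

def power(n, weeks, rate, increase=0.20, trials=1000, seed=0):
    """Fraction of simulated studies in which the real 'increase' is detected."""
    rng = random.Random(seed)
    return sum(trial_detects(n, weeks, rate, increase, rng) for _ in range(trials)) / trials

# Rare crises (one every ten weeks), 30 patients per arm, 8-week study:
# the 20% increase almost never reaches significance.
print(power(n=30, weeks=8, rate=0.1))    # roughly 0.1 -- the study "finds" no difference
# Same crisis rate and same 20% increase, but 500 patients followed for a year:
print(power(n=500, weeks=52, rate=0.1))  # close to 1.0 -- now it is detected
```

The point of the sketch is that "no statistically significant difference" in the first setup is exactly what you would expect even when the new inhalers really are 20% worse; the null result tells you about the study's power, not about the drug.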

A cursory look at some of the studies indicates that they didn't really show there's no difference; what they showed was that there was no difference that a) showed up in a lab in b) a small sample of c) the patients with the mildest disease over d) generally short timeframes.

Lest you think this is special pleading, I'm pretty much resigned to my CFC fate.  But this sort of thing matters broadly.  The FDA used the lightest possible statistical test on a pretty important medication for millions of asthmatics.  Do you want Medicare denying your mother a possibly effective treatment for her otherwise terminal cancer on the basis of the same kind of test?

The most worrying thing here is the real possibility that the FDA got the result the EPA wanted.  Will they be tempted to get the answer Medicare would like to hear about the relative merits of expensive medications?

Again, I'm not saying we shouldn't do CRE.  But for all that Democrats are enjoying thinking of themselves as the Party of Guys in White Coats With The Answers, the binary discussion of CRE (we'll find what works!) is borderline religious in the way it treats government researchers.  The process of finding out what works is considerably more complicated than giving a scientist some money and a hundred human lab rats.  And there is a real danger that a few studies will end up shutting down potentially useful treatments, as first Medicare and then private insurers turn weak or equivocal results into an iron ruling.

This article is available online at:

http://www.theatlantic.com/business/archive/2009/05/comparative-effectiveness-redux/17956/