Department of Awful Statistics

Bill Easterly has a good post on bad infant mortality stats:

Of the 193 countries covered in the study, the researchers were able to use actual, reported data for only 33. To produce the estimates for the other 160 countries, and to project the figures backwards to 1995, the researchers created a sophisticated statistical model. [1]

What's wrong with a model? Well, 1) the credibility of the numbers that emerge from these models must depend on the quality of "real" (that is, actual measured or reported) data, as well as how well these data can be extrapolated to the "modeled" setting (e.g., it would be bad if the real data is primarily from rich countries, and it is "modeled" for the vastly different poor countries - oops, wait, that's exactly the situation in this and most other "modeling" exercises) and 2) the number of people who actually understand these statistical techniques well enough to judge whether a certain model has produced a good estimate or a bunch of garbage is very, very small.

Without enough usable data on stillbirths, the researchers look for indicators with a close logical and causal relationship with stillbirths. In this case they chose neonatal mortality as the main predictive indicator. Uh oh. The numbers for neonatal mortality are also based on a model (where the main predictor is mortality of children under the age of 5) rather than actual data.

So that makes the stillbirth estimates numbers based on a model...which is in turn...based on a model.

In many parts of the world, data is hard to come by.  Unfortunately, voters and donors demand data . . . and when it can't be collected effectively, researchers under heavy pressure to come up with numbers are forced to use alternative methods.

But publishing a number is dangerous: the caveats will be stripped off by an innumerate media, and even if they were left in, the public wouldn't understand what they mean. When the quality of the data is really bad, the public is left less informed than it was before the number was published. I wrote about that problem in regard to Iraq mortality statistics a few years ago, but it's much broader than conflict epidemiology. At some point, we're better off knowing that we don't know.

Megan McArdle is a columnist at Bloomberg View and a former senior editor at The Atlantic. Her new book is The Up Side of Down.
